我有一个data.table,它有多个日期列。这些是K
日期列,因此K
更改。我已经计算了列之间的时间差:
K <- numberOfYears
dateCols = c("fromdatenext", paste0("fromdate" , 1:K))
# create formulas dynamically
all_operations = lapply(seq_len(length(dateCols) - 1), function(i){
as.formula(paste("~difftime(", dateCols[i + 1], ",", dateCols[i],", units = c('weeks'))"))
})
df %>%
mutate_(.dots = setNames(all_operations, paste0("Diff", seq_len(length(dateCols) - 1))))
并得到类似的东西:
fromdatenext fromdate1 fromdate2 fromdate3 fromdate4 Diff1 Diff2 Diff3 Diff4
1 2018-01-01 2017-01-01 2016-01-01 2015-01-01 2014-01-01 -52.14286 weeks -52.28571 weeks -52.14286 weeks -52.14286 weeks
2 2018-10-01 2017-10-01 2016-10-01 2015-10-01 2014-10-01 -52.14286 weeks -52.14286 weeks -52.28571 weeks -53.14286 weeks
3 2018-09-08 2017-09-08 2016-09-08 2015-09-08 2014-09-08 -52.14286 weeks -52.14286 weeks -52.28571 weeks -52.14286 weeks
4 2018-09-22 2017-09-22 2016-09-22 2015-09-22 2014-09-22 -52.14286 weeks -52.14286 weeks -52.28571 weeks -52.14286 weeks
5 2018-05-01 2017-05-01 2016-05-01 2015-05-01 2014-05-01 -52.14286 weeks -52.14286 weeks -52.28571 weeks -52.14286 weeks
6 2018-01-01 2017-01-01 2016-01-01 2015-01-01 2014-01-01 -50.14286 weeks -52.28571 weeks -52.14286 weeks -52.14286 weeks
现在我需要删除所有时间间隔不相等的行。如何动态完成此操作,以便在K
时它仍然有效?
所以期望的结果将是:
fromdatenext fromdate1 fromdate2 fromdate3 fromdate4 Diff1 Diff2 Diff3 Diff4
1 2018-01-01 2017-01-01 2016-01-01 2015-01-01 2014-01-01 -52.14286 weeks -52.28571 weeks -52.14286 weeks -52.14286 weeks
3 2018-09-08 2017-09-08 2016-09-08 2015-09-08 2014-09-08 -52.14286 weeks -52.14286 weeks -52.28571 weeks -52.14286 weeks
4 2018-09-22 2017-09-22 2016-09-22 2015-09-22 2014-09-22 -52.14286 weeks -52.14286 weeks -52.28571 weeks -52.14286 weeks
5 2018-05-01 2017-05-01 2016-05-01 2015-05-01 2014-05-01 -52.14286 weeks -52.14286 weeks -52.28571 weeks -52.14286 weeks
一般来说,我认为这种类型的数据(结构相同/目的列的可变数量(在"长"格式中可能会做得更好,在这种情况下,我建议你的方法应该是:
library(dplyr)
library(tidyr) # pivot_*
dat2 <- dat %>%
mutate(rn = row_number()) %>%
pivot_longer(matches("fromdate[0-9]+"), names_to = "num", values_to = "fromdate") %>%
group_by(rn) %>%
mutate(
fromdatenext = as.Date(fromdatenext),
fromdate = as.Date(fromdate),
num = gsub("\D", "", num),
Diff = `units<-`(diff(c(first(fromdatenext), fromdate)), "weeks")
) %>%
ungroup()
dat2
# # A tibble: 24 x 5
# fromdatenext rn num fromdate Diff
# <date> <int> <chr> <date> <drtn>
# 1 2018-01-01 1 1 2017-01-01 -52.1429 weeks
# 2 2018-01-01 1 2 2016-01-01 -52.2857 weeks
# 3 2018-01-01 1 3 2015-01-01 -52.1429 weeks
# 4 2018-01-01 1 4 2014-01-01 -52.1429 weeks
# 5 2018-10-01 2 1 2017-10-01 -52.1429 weeks
# 6 2018-10-01 2 2 2016-10-01 -52.1429 weeks
# 7 2018-10-01 2 3 2015-10-01 -52.2857 weeks
# 8 2018-10-01 2 4 2014-10-01 -52.1429 weeks
# 9 2018-09-08 3 1 2017-09-08 -52.1429 weeks
# 10 2018-09-08 3 2 2016-09-08 -52.1429 weeks
# # ... with 14 more rows
但是,如果您希望/需要它们以更宽的格式,我们可以重新透视回宽格式:
dat2 %>%
pivot_wider(
id_cols = rn:fromdatenext,
names_from = num,
values_from = c(fromdate, Diff),
names_sep = ""
)
# # A tibble: 6 x 10
# rn fromdatenext fromdate1 fromdate2 fromdate3 fromdate4 Diff1 Diff2 Diff3 Diff4
# <int> <date> <date> <date> <date> <date> <drtn> <drtn> <drtn> <drtn>
# 1 1 2018-01-01 2017-01-01 2016-01-01 2015-01-01 2014-01-01 -52.1429 weeks -52.2857 weeks -52.1429 weeks -52.1429 weeks
# 2 2 2018-10-01 2017-10-01 2016-10-01 2015-10-01 2014-10-01 -52.1429 weeks -52.1429 weeks -52.2857 weeks -52.1429 weeks
# 3 3 2018-09-08 2017-09-08 2016-09-08 2015-09-08 2014-09-08 -52.1429 weeks -52.1429 weeks -52.2857 weeks -52.1429 weeks
# 4 4 2018-09-22 2017-09-22 2016-09-22 2015-09-22 2014-09-22 -52.1429 weeks -52.1429 weeks -52.2857 weeks -52.1429 weeks
# 5 5 2018-05-01 2017-05-01 2016-05-01 2015-05-01 2014-05-01 -52.1429 weeks -52.1429 weeks -52.2857 weeks -52.1429 weeks
# 6 6 2018-01-01 2017-01-01 2016-01-01 2015-01-01 2014-01-01 -52.1429 weeks -52.2857 weeks -52.1429 weeks -52.1429 weeks
数据:
dat <- read.table(header = TRUE, text = "
fromdatenext fromdate1 fromdate2 fromdate3 fromdate4
2018-01-01 2017-01-01 2016-01-01 2015-01-01 2014-01-01
2018-10-01 2017-10-01 2016-10-01 2015-10-01 2014-10-01
2018-09-08 2017-09-08 2016-09-08 2015-09-08 2014-09-08
2018-09-22 2017-09-22 2016-09-22 2015-09-22 2014-09-22
2018-05-01 2017-05-01 2016-05-01 2015-05-01 2014-05-01
2018-01-01 2017-01-01 2016-01-01 2015-01-01 2014-01-01")