I have data with a grouping variable 'id' and a date column with missing values:
id time date
a 1 2004-01-13
a 2 2004-05-04
a 3 NA
a 4 2007-03-20
a 5 NA
b 1 2004-01-11
b 2 2004-05-04
b 3 NA
b 4 2006-10-10
b 5 NA
c 1 2004-05-23
c 2 2004-10-14
c 3 NA
c 4 NA
c 5 NA
Within each 'id', I want to find the difference between each pair of consecutive non-missing dates:
id time date difftime
a 1 2004-01-13 NA
a 2 2004-05-04 (2004-05-04)-(2004-01-13)
a 3 NA NA
a 4 2007-03-20 (2007-03-20)-(2004-05-04)
a 5 NA NA
b 1 2004-01-11 NA
b 2 2004-05-04 (2004-05-04)-(2004-01-11)
b 3 NA NA
b 4 2006-10-10 (2006-10-10)-(2004-05-04)
b 5 NA NA
c 1 2004-05-23 NA
c 2 2004-10-14 (2004-10-14)-(2004-05-23)
c 3 NA NA
c 4 NA NA
c 5 NA NA
I tried the following code, but none of it gives what I want.
data$difftime <- aggregate(date ~ id, data, diff)
library(data.table)
setDT(data)[ , difftime := diff(data$date), by = id]
diff(data$date)
Hopefully this data.table option helps:
setDT(df)[
  ,
  difftime := replace(
    rep(NA, .N),
    which(!is.na(date))[-1],
    diff(na.omit(date))
  ),
  id
]
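To see what each piece of that call contributes, here is a minimal sketch on the dates of a single group (group a from the question). The names `d`, `idx`, `gaps` and `out` are only illustrative, and `as.numeric()` is added here so the result is a plain number of days.

```r
# Dates for group "a" from the question; `d` is an illustrative name
d <- as.Date(c("2004-01-13", "2004-05-04", NA, "2007-03-20", NA))

# Positions of the non-missing dates, dropping the first one
# (the first date in a group has no predecessor to subtract)
idx <- which(!is.na(d))[-1]

# Differences in days between consecutive non-missing dates
gaps <- as.numeric(diff(na.omit(d)))

# Start from an all-NA vector and fill in the differences at those positions
out <- replace(rep(NA_real_, length(d)), idx, gaps)
out
# [1]   NA  112   NA 1050   NA
```

The data.table call above does the same thing once per 'id', with `.N` playing the role of `length(d)`.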
Or, shorter (thanks to @Henrik):
setDT(df)[!is.na(date), difftime := c(NA, diff(date)), id]
which gives:
id time date difftime
1: a 1 2004-01-13 NA
2: a 2 2004-05-04 112
3: a 3 <NA> NA
4: a 4 2007-03-20 1050
5: a 5 <NA> NA
6: b 1 2004-01-11 NA
7: b 2 2004-05-04 114
8: b 3 <NA> NA
9: b 4 2006-10-10 889
10: b 5 <NA> NA
11: c 1 2004-05-23 NA
12: c 2 2004-10-14 144
13: c 3 <NA> NA
14: c 4 <NA> NA
15: c 5 <NA> NA
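For anyone who wants to reproduce this, the following is a sketch that rebuilds the question's data and applies the shorter form. It assumes the data.table package is installed; the name `df` follows the answer, and the `as.numeric()` wrapper is an addition so that the new column holds plain day counts.

```r
library(data.table)

# Rebuild the question's data; `df` matches the name used in the answer
df <- data.frame(
  id   = rep(c("a", "b", "c"), each = 5),
  time = rep(1:5, times = 3),
  date = as.Date(c(
    "2004-01-13", "2004-05-04", NA, "2007-03-20", NA,
    "2004-01-11", "2004-05-04", NA, "2006-10-10", NA,
    "2004-05-23", "2004-10-14", NA, NA, NA
  ))
)

# Only rows with a non-missing date are updated; within each id the first
# such row gets NA and the rest get the gap in days to the previous date
setDT(df)[!is.na(date), difftime := c(NA, as.numeric(diff(date))), by = id]

df$difftime
# [1]   NA  112   NA 1050   NA   NA  114   NA  889   NA   NA  144   NA   NA   NA
```

As for the attempt in the question, `diff(data$date)` inside `by = id` refers to the whole column rather than the current group's rows, and `diff()` returns one element fewer than its input, so the lengths do not line up. Using the bare column name `date` and padding with a leading `NA` addresses both points.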