I am trying to find the lag between two series. Suppose there is a variable temp2 whose values lag behind temp1, and the lag is not constant.
library(data.table)
dt <- data.table(
datetime = seq(as.POSIXct("2000-01-01 00:00:00"),as.POSIXct("2000-01-01 09:00:00"), by = "1 hour"),
temp1 = seq(30, 21, by = -1),
temp2 = c(30, seq(30, 25, by = -1), seq(25, 23, by = -1))
)
I would like an additional column, "lag", equal to the lag between temp1 and temp2, so that the result looks like this:
dt <- data.table(
datetime = seq(as.POSIXct("2000-01-01 00:00:00"),as.POSIXct("2000-01-01 09:00:00"), by = "1 hour"),
temp1 = seq(30, 21, by = -1),
temp2 = c(30, seq(30, 25, by = -1), seq(25, 23, by = -1)),
lag = c(0, 1, 1, 1, 1, 1, 2, 2, NA, NA)
)
Thanks for your help :)
1) Subtraction. If simple subtraction is sufficient, then:
library(data.table)
dt[, lag := temp2 - temp1]
giving:
> dt
               datetime temp1 temp2 lag
 1: 2000-01-01 00:00:00    30    30   0
 2: 2000-01-01 01:00:00    29    30   1
 3: 2000-01-01 02:00:00    28    29   1
 4: 2000-01-01 03:00:00    27    28   1
 5: 2000-01-01 04:00:00    26    27   1
 6: 2000-01-01 05:00:00    25    26   1
 7: 2000-01-01 06:00:00    24    25   1
 8: 2000-01-01 07:00:00    23    25   2
 9: 2000-01-01 08:00:00    22    24   2
10: 2000-01-01 09:00:00    21    23   2
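Subtraction recovers a lag here only because temp1 falls by exactly one degree per hour, so a temperature difference happens to equal a difference in hours. As a hedged alternative, the sketch below uses base R's match() to look up the row at which temp2 first takes each temp1 value. It assumes the values match exactly and that both series are monotone (match() returns the first occurrence anywhere in temp2, including earlier rows), so treat it as an illustration rather than a general solution.

```r
# Data from the question, as plain vectors
temp1 <- seq(30, 21, by = -1)
temp2 <- c(30, seq(30, 25, by = -1), seq(25, 23, by = -1))

# match() gives the first position in temp2 holding each temp1 value (NA if absent);
# subtracting the current position gives the lag in rows (hours, for hourly data).
lag_rows <- match(temp1, temp2) - seq_along(temp1)
lag_rows
# [1]  0  1  1  1  1  1  2  2 NA NA

# With the data.table from the question, the same idea would be:
# dt[, lag := match(temp1, temp2) - seq_len(.N)]
```

On the question's data this gives the lag column requested, with NA where temp2 never reaches the temp1 value.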
2) dtw. Another possibility is dynamic time warping. You may need to customize this depending on what you want, but, for example, try the following:
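library(data.table)
library(dtw)
# Align the two series; fm$index1 and fm$index2 are the matched positions in temp1 and temp2
fm <- dt[, dtw(temp1, temp2)]
# For each temp1 position, take the smallest offset to its matched temp2 positions
dt[, lag := tapply(fm$index2 - fm$index1, fm$index1, min)]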
giving:
> dt
               datetime temp1 temp2 lag
 1: 2000-01-01 00:00:00    30    30   0
 2: 2000-01-01 01:00:00    29    30   1
 3: 2000-01-01 02:00:00    28    29   1
 4: 2000-01-01 03:00:00    27    28   1
 5: 2000-01-01 04:00:00    26    27   1
 6: 2000-01-01 05:00:00    25    26   1
 7: 2000-01-01 06:00:00    24    25   2
 8: 2000-01-01 07:00:00    23    25   2
 9: 2000-01-01 08:00:00    22    24   1
10: 2000-01-01 09:00:00    21    23   0
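To make the roles of index1 and index2 more concrete, here is a small base R sketch of the idea behind a warping path. It is not the dtw package's implementation: dtw_path is a hypothetical helper, the local cost is assumed to be the absolute difference, and every step has unit weight, whereas dtw's default step pattern weights diagonal steps differently. Paths can therefore differ from dtw's, especially where there are ties.

```r
# Minimal dynamic time warping sketch (illustrative only, not the dtw package)
dtw_path <- function(x, y) {
  n <- length(x)
  m <- length(y)
  cost <- abs(outer(x, y, "-"))      # local cost |x[i] - y[j]|
  acc <- matrix(Inf, n, m)           # accumulated cost of the best path to (i, j)
  acc[1, 1] <- cost[1, 1]
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      if (i == 1 && j == 1) next
      prev <- min(
        if (i > 1) acc[i - 1, j] else Inf,
        if (j > 1) acc[i, j - 1] else Inf,
        if (i > 1 && j > 1) acc[i - 1, j - 1] else Inf
      )
      acc[i, j] <- cost[i, j] + prev
    }
  }
  # Backtrack from the end (n, m) to the start (1, 1), recording matched positions
  i <- n
  j <- m
  index1 <- i
  index2 <- j
  while (i > 1 || j > 1) {
    cand <- c(
      diag = if (i > 1 && j > 1) acc[i - 1, j - 1] else Inf,
      up   = if (i > 1) acc[i - 1, j] else Inf,
      left = if (j > 1) acc[i, j - 1] else Inf
    )
    step <- names(cand)[which.min(cand)]
    if (step == "diag") {
      i <- i - 1
      j <- j - 1
    } else if (step == "up") {
      i <- i - 1
    } else {
      j <- j - 1
    }
    index1 <- c(i, index1)
    index2 <- c(j, index2)
  }
  list(distance = acc[n, m], index1 = index1, index2 = index2)
}

# Toy input: y repeats its first value once, so it trails x by one step afterwards
p <- dtw_path(c(1, 2, 3), c(1, 1, 2, 3))
toy_lag <- as.vector(tapply(p$index2 - p$index1, p$index1, min))
toy_lag
# [1] 0 1 1
```

The final tapply() line mirrors the answer's approach: for each position in the first series, it keeps the smallest offset to the positions it was matched with in the second series.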
Note: The ccf function may also be helpful here.
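As a rough illustration of how ccf could be used, the sketch below estimates a single overall lag from the peak of the cross-correlation. It runs on synthetic data (an assumption, not the question's series) in which y repeats x two steps later. Because ccf yields one lag for the whole series, it cannot by itself reproduce a lag that changes from row to row.

```r
set.seed(1)                                      # reproducible synthetic data
n <- 200
x <- rnorm(n)
y <- c(0, 0, head(x, -2)) + rnorm(n, sd = 0.1)   # y[t] is roughly x[t - 2]

# ccf(x, y) at lag k estimates the correlation between x[t + k] and y[t],
# so a peak at a negative k means y trails x by |k| steps.
cc <- ccf(x, y, lag.max = 5, plot = FALSE)
best_lag <- cc$lag[which.max(cc$acf)]
best_lag
# [1] -2
```

The sign convention is easy to get backwards, so it is worth checking against a series with a known shift, as here, before applying it to real data.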