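I want to linearly interpolate time series data in a data frame in steps of 60 seconds, as shown below, where seconds is the variable to be interpolated: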
mydf <- data.frame(measurement = c(2,6,4,1,7), other_measurement = c(2,5,8,6,4), seconds = c(60,60,120,360,360))
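(The actual data frame is many orders of magnitude larger than this.)
Then, once the intermediate multiples of 60 have been inserted, I want the cells generated for those new rows to be filled with NA. I have found some solutions that come close, such as creating a data frame of multiples of 60 and merging it with my data frame, but none of them worked exactly: they still filled in all the missing cells across multiple columns. Many thanks.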
Base R
newdf <- data.frame(seconds = do.call(seq, c(as.list(range(mydf$seconds)), by = 60)))
merge(mydf, newdf, all = TRUE)
#   seconds measurement other_measurement
# 1      60           2                 2
# 2      60           6                 5
# 3     120           4                 8
# 4     180          NA                NA
# 5     240          NA                NA
# 6     300          NA                NA
# 7     360           1                 6
# 8     360           7                 4
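As a minimal sketch of what the base R answer above does, the grid can also be built without `do.call`, and the rows that exist only because of the grid can be pulled out for inspection. The names `grid`, `filled` and `added` are illustrative and do not appear in the original answer.

```r
mydf <- data.frame(measurement = c(2,6,4,1,7),
                   other_measurement = c(2,5,8,6,4),
                   seconds = c(60,60,120,360,360))

# Same 60-second grid as the do.call(seq, ...) line, written out explicitly
grid <- seq(min(mydf$seconds), max(mydf$seconds), by = 60)
newdf <- data.frame(seconds = grid)

# all = TRUE keeps rows from both sides (a full outer join on "seconds")
filled <- merge(mydf, newdf, all = TRUE)

# Rows whose seconds value was not in the original data came from the grid;
# their measurement columns are NA
added <- filled[!filled$seconds %in% mydf$seconds, ]
```

Because `merge()` matches on the shared column name, only `seconds` is used as the key and the other columns are left as NA in the new rows.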
dplyr
library(dplyr)
mydf %>%
  # reframe() needs dplyr >= 1.1.0; older versions use summarize() here
  reframe(seconds = seq(min(seconds), max(seconds), by = 60)) %>%
  full_join(mydf, ., by = "seconds")
#   measurement other_measurement seconds
# 1           2                 2      60
# 2           6                 5      60
# 3           4                 8     120
# 4           1                 6     360
# 5           7                 4     360
# 6          NA                NA     180
# 7          NA                NA     240
# 8          NA                NA     300
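Note that the dplyr result appends the new rows at the end rather than in time order. If time order matters, one option (a sketch, not part of the original answer) is to sort with `arrange()` after the join. The names `grid` and `ordered` are illustrative.

```r
library(dplyr)

mydf <- data.frame(measurement = c(2,6,4,1,7),
                   other_measurement = c(2,5,8,6,4),
                   seconds = c(60,60,120,360,360))

# One row per multiple of 60 between the smallest and largest seconds value
grid <- data.frame(seconds = seq(min(mydf$seconds), max(mydf$seconds), by = 60))

ordered <- mydf %>%
  full_join(grid, by = "seconds") %>%  # unmatched grid rows get NA measurements
  arrange(seconds)                     # put the NA rows back in time order
```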