R - Interpolating data conditional on gap length



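I want to interpolate a time series using a spline method. I would like to apply a "gap tolerance": if the number of consecutive NA days is greater than x, the data should stay NA and not be interpolated. In my example, assume that I do not interpolate runs of more than three consecutive NAs. Sample data: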

x <- seq(as.Date("2016-01-01"),as.Date("2016-01-31"),by="day")
y <- c(0.45062130 ,0.51136174 ,NA ,NA ,0.29481738 ,NA ,0.27713756 ,0.62638512 ,0.23547530,0.29253901 ,0.75899501 ,0.67779756 ,0.51831742 ,0.08050147 ,0.71183739 ,NA ,0.79406706 ,NA,0.03434758 ,0.59573892 ,0.22102821 ,0.13154414 ,NA ,NA ,NA ,NA ,0.23692593,0.95215104 ,0.38810846 ,0.17970580 ,0.05176054)
df <- data.frame(x,y)
> df
x          y
2016-01-01 0.45062130
2016-01-02 0.51136174
2016-01-03         NA
2016-01-04         NA
2016-01-05 0.29481738
2016-01-06         NA
2016-01-07 0.27713756
2016-01-08 0.62638512
2016-01-09 0.23547530
2016-01-10 0.29253901
2016-01-11 0.75899501
2016-01-12 0.67779756
2016-01-13 0.51831742
2016-01-14 0.08050147
2016-01-15 0.71183739
2016-01-16         NA
2016-01-17 0.79406706
2016-01-18         NA
2016-01-19 0.03434758
2016-01-20 0.59573892
2016-01-21 0.22102821
2016-01-22 0.13154414
2016-01-23         NA
2016-01-24         NA
2016-01-25         NA
2016-01-26         NA
2016-01-27 0.23692593
2016-01-28 0.95215104
2016-01-29 0.38810846
2016-01-30 0.17970580
2016-01-31 0.05176054

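One idea I had is to create two new data frames: the first fully interpolated, the second with the NAs that fall within the gap tolerance removed, and then merge the two. Is there a better way to do this?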

The dataset I want looks like this:

> df
x          y
2016-01-01 0.45062130
2016-01-02 0.51136174
2016-01-03 0.35684617
2016-01-04 0.30481738
2016-01-05 0.29481738
2016-01-06 0.28481738
2016-01-07 0.27713756
2016-01-08 0.62638512
2016-01-09 0.23547530
2016-01-10 0.29253901
2016-01-11 0.75899501
2016-01-12 0.67779756
2016-01-13 0.51831742
2016-01-14 0.08050147
2016-01-15 0.71183739
2016-01-16 0.75158886
2016-01-17 0.79406706
2016-01-18 0.21584455
2016-01-19 0.03434758
2016-01-20 0.59573892
2016-01-21 0.22102821
2016-01-22 0.13154414
2016-01-23         NA
2016-01-24         NA
2016-01-25         NA
2016-01-26         NA
2016-01-27 0.23692593
2016-01-28 0.95215104
2016-01-29 0.38810846
2016-01-30 0.17970580
2016-01-31 0.05176054
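
The two-step idea from the question (interpolate everything, then put NA back wherever the gap is too long) can be sketched in base R roughly as follows. This is only an illustration: `fill_short_gaps` is a hypothetical helper name, not a function from any package, and the short series is made-up data rather than the series above.

```r
# Hypothetical helper (not part of any package): spline-fill only the NA runs
# whose length is at most `maxgap`; longer runs are left as NA.
fill_short_gaps <- function(x, y, maxgap = 3) {
  xn <- as.numeric(x)              # Dates become day counts
  ok <- !is.na(y)
  # Step 1: interpolate every position from the observed points.
  filled <- spline(xn[ok], y[ok], xout = xn, method = "fmm")$y
  # Step 2: find runs of NA in the original data and mask the long ones.
  runs <- rle(is.na(y))
  too_long <- rep(runs$values & runs$lengths > maxgap, runs$lengths)
  filled[too_long] <- NA
  filled
}

# Made-up example: one 1-day gap (filled) and one 4-day gap (kept as NA).
x <- seq(as.Date("2016-01-01"), by = "day", length.out = 10)
y <- c(0.45, NA, 0.29, 0.28, NA, NA, NA, NA, 0.24, 0.95)
out <- fill_short_gaps(x, y, maxgap = 3)
```

Note that `spline()` also extrapolates beyond the first and last observation, so in this sketch a leading or trailing NA run no longer than `maxgap` would be filled by extrapolation.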

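Try na.spline in the zoo package. Its maxgap argument is the maximum number of consecutive NAs to fill; longer gaps are left unchanged. (fortify.zoo(z) will convert z back to a data frame, although you may prefer to keep it in zoo form to take advantage of the other facilities there.) Also have a look at the other na.* functions in zoo.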

library(zoo)
z <- na.spline(zoo(y, x), maxgap = 2)

giving:

> z
2016-01-01 2016-01-02 2016-01-03 2016-01-04 2016-01-05 2016-01-06 2016-01-07 
0.45062130 0.51136174 0.50365727 0.43252778 0.29481738 0.14613360 0.27713756 
2016-01-08 2016-01-09 2016-01-10 2016-01-11 2016-01-12 2016-01-13 2016-01-14 
0.62638512 0.23547530 0.29253901 0.75899501 0.67779756 0.51831742 0.08050147 
2016-01-15 2016-01-16 2016-01-17 2016-01-18 2016-01-19 2016-01-20 2016-01-21 
0.71183739 1.06652092 0.79406706 0.17526465 0.03434758 0.59573892 0.22102821 
2016-01-22 2016-01-23 2016-01-24 2016-01-25 2016-01-26 2016-01-27 2016-01-28 
0.13154414         NA         NA         NA         NA 0.23692593 0.95215104 
2016-01-29 2016-01-30 2016-01-31 
0.38810846 0.17970580 0.05176054 
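
Since the question allows gaps of up to three days, `maxgap = 3` may be closer to the stated tolerance than `maxgap = 2`; for the sample data both settings leave only the four-day gap unfilled. The sketch below uses made-up values and rebuilds a data frame with `index()` and `coredata()` from zoo, so that the column names are chosen explicitly. It assumes zoo is installed, and `df_filled` is just an illustrative name.

```r
library(zoo)

# Made-up example: one 1-day gap and one 4-day gap.
x <- seq(as.Date("2016-01-01"), by = "day", length.out = 10)
y <- c(0.45, NA, 0.29, 0.28, NA, NA, NA, NA, 0.24, 0.95)

# Fill runs of up to 3 consecutive NAs; longer runs stay NA.
z <- na.spline(zoo(y, x), maxgap = 3)

# Back to a plain data frame with the original column names.
df_filled <- data.frame(x = index(z), y = coredata(z))
```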
