我有一些6分钟频率的热电偶数据。热电偶安装在不同的高度,在每个高度都有许多按径向位置区分的热电偶
DT_TI_RECORDED HEIGHT POS TEMPERATURE
2018-05-16 00:00:00 1 90 111
2018-05-16 00:00:00 1 180 112
2018-05-16 00:00:00 1 270 113
2018-05-16 00:00:00 2 90 112
2018-05-16 00:00:00 2 180 114
2018-05-16 00:00:00 2 270 115
2018-05-16 00:00:00 3 90 112
2018-05-16 00:00:00 3 180 112
2018-05-16 00:00:00 3 270 113
...
2018-05-16 00:06:00 1 90 111
2018-05-16 00:06:00 1 180 112
2018-05-16 00:06:00 1 270 113
2018-05-16 00:06:00 2 90 112
2018-05-16 00:06:00 2 180 114
2018-05-16 00:06:00 2 270 112
2018-05-16 00:06:00 3 90 114
2018-05-16 00:06:00 3 180 112
2018-05-16 00:06:00 3 270 114
...
对于每个独特的高度和位置组合,每 6 分钟我想计算一个向后n小时移动方差,假设每小时 4 小时。
我尝试复制的原始代码是为 SAS 统计包编写
的PROC EXPAND DATA=Raw_data
OUT=Moving_Variance
ALIGN = BEGINNING
;
by HEIGHT POS;
ID DT_TI_RECORDED ;
CONVERT TEMPERATURE = Moving_4hour_Var / METHOD = none TRANSFORMOUT = (MOVVAR 40);
#/* 40 obs at 6min freq = 4hour moving variance*/
QUIT;
我花了几个小时在谷歌上搜索,我认为我需要使用的 R 库叫做zoo
我想要的功能rollapply
但我无法弄清楚如何将聚合与rollapply
相结合。
我试过了
moving_var <- Raw_data %>%
aggregate(HEIGHT,POS) %>%
rollapply( TEMPERATURE, width = 40, FUN = sd, fill = NA)
但不起作用。我对R编程很陌生,这让我发疯。
尝试以下聚合:
library(zoo)
result = aggregate(temp ~ pos + height,
data = df,
FUN = function(x){
rollapply(x, width = 40, FUN = var, by = 40)
}
)
width
是滚动窗口的宽度,而by
是下一个窗口起点跳过的点数。每个窗口中有 40 个,您将在前一个窗口的末尾旁边获得每个窗口的开始。
生成的数据框的每个窗口都有一列。这种结构可以被认为是"宽的"。如果要将其设置为"长"格式,请使用 tidyr 中的gather
或 reshape2 中的melt
。
例:
df = structure(list(pos = c(0, 90, 180, 270, 0, 90, 180, 270, 0, 90,
180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270,
0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180,
270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90,
180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270,
0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180,
270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90,
180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270,
0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180,
270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90,
180, 270, 0, 90, 180, 270, 0, 90, 180, 270, 0, 90, 180, 270,
0, 90, 180, 270), height = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L), temp = 1:160), .Names = c("pos",
"height", "temp"), row.names = c(NA, -160L), class = "data.frame")
> head(df,20)
pos height temp
1 0 1 1
2 90 1 2
3 180 1 3
4 270 1 4
5 0 2 5
6 90 2 6
7 180 2 7
8 270 2 8
9 0 3 9
10 90 3 10
11 180 3 11
12 270 3 12
13 0 4 13
14 90 4 14
15 180 4 15
16 270 4 16
17 0 1 17
18 90 1 18
19 180 1 19
20 270 1 20
library(zoo)
result = aggregate(temp ~ pos + height,
data = df,
FUN = function(x){
rollapply(x, width = 3, FUN = var, by = 3)
}
)
将导致:
pos height temp.1 temp.2 temp.3
1 0 1 256 256 256
2 90 1 256 256 256
3 180 1 256 256 256
4 270 1 256 256 256
5 0 2 256 256 256
6 90 2 256 256 256
7 180 2 256 256 256
8 270 2 256 256 256
9 0 3 256 256 256
10 90 3 256 256 256
11 180 3 256 256 256
12 270 3 256 256 256
13 0 4 256 256 256
14 90 4 256 256 256
15 180 4 256 256 256
16 270 4 256 256 256