我正在尝试编写一个脚本,该脚本将LOESS平滑应用于我的数据帧的所有列。我使用的是具有loess.as
功能的fANCOVA
软件包,该软件包具有自动参数选择功能。
这是我的数据帧dat
> dat
date AUSTRIA GERMANY SWITZERLAND
1 5/1/2022 3908 15769 1023
2 5/2/2022 5541 87119 2994
3 5/3/2022 7071 106908 2148
4 5/4/2022 5920 89796 2150
5 5/5/2022 6023 80521 1918
6 5/6/2022 5075 67328 1909
7 5/7/2022 3429 26870 1183
8 5/8/2022 3883 13256 834
9 5/9/2022 4681 84985 2496
10 5/10/2022 5894 94974 1965
11 5/11/2022 5000 77460 1846
12 5/12/2022 4726 65100 1708
13 5/13/2022 3681 56920 1545
14 5/14/2022 3293 20062 937
15 5/15/2022 2478 9603 696
16 5/16/2022 3247 65791 1939
17 5/17/2022 4376 68200 1476
18 5/18/2022 3365 53791 1426
19 5/19/2022 3117 44358 1269
20 5/20/2022 2693 39967 1080
21 5/21/2022 1734 13764 773
22 5/22/2022 1744 6922 592
23 5/23/2022 2403 47125 1459
24 5/24/2022 3130 46690 1113
25 5/25/2022 2597 35348 1084
26 5/26/2022 1736 NA 503
27 5/27/2022 1813 NA 1162
28 5/28/2022 1881 NA NA
29 5/29/2022 1736 NA NA
30 5/30/2022 2389 NA NA
31 5/31/2022 3571 NA NA
然后将loess.as
应用于奥地利,例如
dat$date <- as.Date(dat$date, format = "%m/%d/%Y")
dat$DATE_NUM = as.numeric(dat$date)
fit <- loess.as(df$DATE_NUM,df$AUSTRIA, degree=2, criterion="gcv", family ="gaussian",user.span = NULL, plot = F)$fitted
fit
将返回奥地利所需的拟合值。我希望能够在其他国家的列中执行相同的操作(注意,每个列的长度不同(,并且输出应该在类似于dat
结构的列中。我曾试图寻找类似问题的解决方案,但我很难跟上,遇到了错误。
我们可以在base R
中使用lapply/sapply
library(fANCOVA)
out <- sapply(dat[c("AUSTRIA", "GERMANY", "SWITZERLAND")], (x) {
i1 <- complete.cases(x)
replace(x, i1, loess.as(dat$DATE_NUM[i1], x[i1], degree = 2,
criterion = "gcv",family ="gaussian",user.span = NULL, plot = FALSE)$fitted)
})
-输出
> head(out)
AUSTRIA GERMANY SWITZERLAND
[1,] 4014.674 18716.87 1598.758
[2,] 5564.295 79813.73 2015.231
[3,] 6384.348 104581.65 2200.749
[4,] 6497.308 94165.40 2204.399
[5,] 5770.513 81417.57 1895.429
[6,] 4748.673 61337.24 1547.563
数据
dat <- structure(list(date = structure(c(19113, 19114, 19115, 19116,
19117, 19118, 19119, 19120, 19121, 19122, 19123, 19124, 19125,
19126, 19127, 19128, 19129, 19130, 19131, 19132, 19133, 19134,
19135, 19136, 19137, 19138, 19139, 19140, 19141, 19142, 19143
), class = "Date"), AUSTRIA = c(3908L, 5541L, 7071L, 5920L, 6023L,
5075L, 3429L, 3883L, 4681L, 5894L, 5000L, 4726L, 3681L, 3293L,
2478L, 3247L, 4376L, 3365L, 3117L, 2693L, 1734L, 1744L, 2403L,
3130L, 2597L, 1736L, 1813L, 1881L, 1736L, 2389L, 3571L), GERMANY = c(15769L,
87119L, 106908L, 89796L, 80521L, 67328L, 26870L, 13256L, 84985L,
94974L, 77460L, 65100L, 56920L, 20062L, 9603L, 65791L, 68200L,
53791L, 44358L, 39967L, 13764L, 6922L, 47125L, 46690L, 35348L,
NA, NA, NA, NA, NA, NA), SWITZERLAND = c(1023L, 2994L, 2148L,
2150L, 1918L, 1909L, 1183L, 834L, 2496L, 1965L, 1846L, 1708L,
1545L, 937L, 696L, 1939L, 1476L, 1426L, 1269L, 1080L, 773L, 592L,
1459L, 1113L, 1084L, 503L, 1162L, NA, NA, NA, NA), DATE_NUM = c(19113,
19114, 19115, 19116, 19117, 19118, 19119, 19120, 19121, 19122,
19123, 19124, 19125, 19126, 19127, 19128, 19129, 19130, 19131,
19132, 19133, 19134, 19135, 19136, 19137, 19138, 19139, 19140,
19141, 19142, 19143)), row.names = c("1", "2", "3", "4", "5",
"6", "7", "8", "9", "10", "11", "12", "13", "14", "15", "16",
"17", "18", "19", "20", "21", "22", "23", "24", "25", "26", "27",
"28", "29", "30", "31"), class = "data.frame")