Linear regression predictions in R using the leave-one-out approach



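I have 3 linear regression models built on mtcars and would like to use them to generate predictions for each row of the mtcars table. These predictions should be added to the mtcars data frame as additional columns (3 additional columns) and should be generated in a for loop using the leave-one-out approach. In addition, the predictions for model1 and model2 should be made by "grouping" on the number of cylinders, whereas the predictions for model3 should be made without any grouping.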

So far I have been able to get something working with a single model in a loop:

```
model1 = lm(hp ~ mpg, data = mtcars)
model2 = lm(hp ~ mpg + hp, data = mtcars)
model3 = lm(hp ~ mpg + hp + wt, data = mtcars)
fitted_value <- NULL
for (i in 1:nrow(mtcars)) {
  validation <- mtcars[i, ]
  training <- mtcars[-i, ]
  model1 <- lm(mpg ~ hp, data = training)
  fitted_value[i] <- predict(model1, newdata = validation)
}
```

I would like to be able to generate all the model predictions by first putting all the models in a list or vector and then attaching the result to the mtcars data frame. Something like this:
```
model1 = lm(hp ~ mpg, data = mtcars)
model2 = lm(hp ~ mpg + hp, data = mtcars)
model3 = lm(hp ~ mpg + hp + wt, data = mtcars)
models <- list(model1, model2, model3)
fitted_value <- NULL
for (i in 1:nrow(mtcars)) {
  validation <- mtcars[i, ]
  training <- mtcars[-i, ]
  fitted_value[i] <- predict(models, newdata = validation)
}
```
Thank you for your help.

You can use a nested map to fit each of the three formulas for every row i, and then use bind_cols to attach the predictions to mtcars.

library(tidyverse)
frml_1 <- as.formula("hp ~ mpg")
frml_2 <- as.formula("hp ~ mpg + drat")
frml_3 <- as.formula("hp ~ mpg + drat + wt")
frmls <- list(frml_1 = frml_1, frml_2 = frml_2, frml_3 = frml_3)
mtcars %>%
  bind_cols(
    map(1:nrow(mtcars), function(i) {
      map_dfc(frmls, function(frml) {
        training <- mtcars[-i, ]
        fit <- lm(frml, data = training)

        validation <- mtcars[i, ]
        predict(fit, newdata = validation)
      })
    }) %>%
      bind_rows()
  )
                     mpg cyl  disp  hp drat    wt  qsec vs am gear carb    frml_1    frml_2    frml_3
Mazda RX4           21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4 138.65796 138.65796 140.61340
Mazda RX4 Wag       21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4 138.65796 138.65796 139.55056
Datsun 710          22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1 122.76445 122.76445 124.91348
Hornet 4 Drive      21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1 135.12607 135.12607 134.36670
Hornet Sportabout   18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2 158.96634 158.96634 158.85438
Valiant             18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1 164.26418 164.26418 164.42112
Duster 360          14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4 197.81716 197.81716 199.74665
...

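Note that hp has been removed from the right-hand side of the formulas, because hp is also the response. I used drat instead for demonstration.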
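The question also asks for the first two models to be predicted within cylinder groups, which the pipeline above does not do. The following is a minimal base R sketch of one possible reading of that requirement: each row is predicted from a model fitted only on the other rows with the same cyl value. The helper name loo_predict, its group argument, and the column names pred_1 to pred_3 are hypothetical, and drat again stands in for the duplicated hp term.

```r
# Hypothetical helper: leave-one-out predictions for a single formula.
# If `group` names a column, row i is predicted from a model fitted only on
# the OTHER rows that share row i's value in that column (an assumed reading
# of "grouping by cylinder number").
loo_predict <- function(frml, data, group = NULL) {
  vapply(seq_len(nrow(data)), function(i) {
    keep <- seq_len(nrow(data)) != i                      # drop the held-out row
    if (!is.null(group)) {
      keep <- keep & data[[group]] == data[[group]][i]    # stay within the group
    }
    fit <- lm(frml, data = data[keep, , drop = FALSE])
    unname(predict(fit, newdata = data[i, , drop = FALSE]))
  }, numeric(1))
}

res <- mtcars
res$pred_1 <- loo_predict(hp ~ mpg, mtcars, group = "cyl")          # grouped
res$pred_2 <- loo_predict(hp ~ mpg + drat, mtcars, group = "cyl")   # grouped
res$pred_3 <- loo_predict(hp ~ mpg + drat + wt, mtcars)             # not grouped
head(res)
```

Note that the smallest group (6 cylinders) has only 7 cars, so the grouped fits rest on very few observations and the predictions can be unstable.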

I was able to achieve this by running the following script:

fitted_value1 <- NULL
fitted_value2 <- NULL
fitted_value3 <- NULL
for (i in 1:nrow(mtcars)) {
  validation <- mtcars[i, ]
  training <- mtcars[-i, ]
  model1 = lm(hp ~ mpg, data = training)
  model2 = lm(hp ~ mpg + hp, data = training)
  model3 = lm(hp ~ mpg + hp + wt, data = training)
  fitted_value1[i] <- predict(model1, newdata = validation)
  fitted_value2[i] <- predict(model2, newdata = validation)
  fitted_value3[i] <- predict(model3, newdata = validation)
  res <- as.data.frame(cbind(mtcars, fitted_value1, fitted_value2, fitted_value3))
}

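How can I improve this code? I would like to take the models out of the loop, save them in a list, and only refer to that list inside the loop. This is more or less what I would ideally like (but it does not work):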

model1 = lm(hp ~ mpg, data = mtcars)
model2 = lm(hp ~ mpg + hp, data = mtcars)
model3 = lm(hp ~ mpg + hp + wt, data = mtcars)
models <- list(model1, model2, model3)
fitted_value <- NULL
for (i in 1:nrow(mtcars)) {
  for (j in models) {
    validation <- mtcars[i, ]
    training <- mtcars[-i, ]
    fitted_value[i] <- predict(models[j], newdata = validation)
    # this should save the predictions for all the models and append it to the original dataframe
    df <- cbind(mtcars, fitted_value)
  }
}

Thank you for your help.
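For the loop-based version, the attempt above most likely fails for three reasons: for (j in models) hands j a fitted model rather than an index, models[j] would return a list rather than a model even with a valid index, and models fitted on all of mtcars have already seen the row that is supposed to be held out. One possible fix is to store the formulas in the list and refit inside the loop. The sketch below assumes that drat replaces the duplicated hp term, as in the answer above, and reuses the fitted_value1 to fitted_value3 column names from the script; the preds matrix is a hypothetical helper object.

```r
# Keep the model specifications (formulas) in a list, not fitted models.
frmls <- list(
  fitted_value1 = hp ~ mpg,
  fitted_value2 = hp ~ mpg + drat,       # assumption: drat replaces the repeated hp
  fitted_value3 = hp ~ mpg + drat + wt
)

# One row per car, one column per formula; filled in by the loops below.
preds <- matrix(NA_real_, nrow = nrow(mtcars), ncol = length(frmls),
                dimnames = list(rownames(mtcars), names(frmls)))

for (i in seq_len(nrow(mtcars))) {
  validation <- mtcars[i, ]
  training   <- mtcars[-i, ]
  for (j in seq_along(frmls)) {               # iterate over indices, not objects
    fit <- lm(frmls[[j]], data = training)    # [[ ]] extracts the formula itself
    preds[i, j] <- predict(fit, newdata = validation)
  }
}

res <- cbind(mtcars, preds)   # bind once, after the loops have finished
head(res)
```

Binding the predictions once after the loops avoids rebuilding the data frame on every iteration, and adding a fourth model only requires one more entry in frmls.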
