Fitting linear mixed models to several subsets of data in R



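Consider the sleepstudy data from the lme4 package, shown below. It contains 18 subjects, each with repeated measurements of Reaction on different days. Suppose there is an additional variable, Days2, created as follows: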

library("lme4")
sleepstudy$Days2<- rep(10:18, rep(20,9)) 
sleepstudy
Reaction Days Subject Age Days2
1 249.5600    0     308  20    10
2 258.7047    1     308  20    10
3 250.8006    2     308  20    10
...
178 343.2199    7     372  28    18
179 369.1417    8     372  28    18
180 364.1236    9     372  28    18

The data has 9 unique values of Days2, namely 10, 11, ..., 18. Create 9 subsets as follows:

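  • Subset 1 contains all rows with Days2 of 10 or more, which is simply the whole data set
  • Subset 2 contains all rows with Days2 of 11 or more
  • Subset 9 contains all rows with Days2 of 18 or more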
tt <- sort(unique(sleepstudy$Days2))
Subset1 <- sleepstudy[sleepstudy$Days2>=tt[1], ]
Subset2<-sleepstudy[sleepstudy$Days2>=tt[2], ]
...
Subset9<-sleepstudy[sleepstudy$Days2>=tt[length(tt)], ] 
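
The nine subsets above follow one rule, so they could also be built in a single call and kept in a list. This is a minimal sketch, assuming lme4 is installed; the names `sleep`, `subsets`, and `cutoff` are illustrative and not part of the original code.

```r
# Recreate the question's data on a copy (assumes lme4 is installed).
sleep <- lme4::sleepstudy
sleep$Days2 <- rep(10:18, each = 20)  # same values as rep(10:18, rep(20, 9))

tt <- sort(unique(sleep$Days2))

# One data frame per cutoff: all rows with Days2 >= cutoff.
subsets <- lapply(tt, function(cutoff) sleep[sleep$Days2 >= cutoff, ])
names(subsets) <- paste0("Subset", seq_along(tt))

# Row counts shrink by 20 per step: 180, 160, ..., 20.
sapply(subsets, nrow)
```

Keeping the subsets in a list means `subsets[[i]]` takes the place of the separately named `Subset1`, ..., `Subset9` objects.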

Fit a separate random-intercept linear mixed model to each of the 9 subsets, then predict at Days = tt[i], as follows:

fit1 = lmer(Reaction ~ Days + (1 | Subject), data = Subset1)
newSubset1 <- data.frame(   Days = tt[1], Subject = unique(Subset1$Subject))
newSubset1$Predicted_Response <- predict(fit1, newdata = newSubset1)
fit2 = lmer(Reaction ~ Days + (1 | Subject), data = Subset2)
newSubset2 <- data.frame(   Days = tt[2], Subject = unique(Subset2$Subject))
newSubset2$Predicted_Response <- predict(fit2, newdata = newSubset2)
...
fit9 = lmer(Reaction ~ Days + (1 | Subject), data = Subset9)
newSubset9 <- data.frame(  Days = tt[9], Subject = unique(Subset9$Subject) )
newSubset9$Predicted_Response <- predict(fit9, newdata = newSubset9)

The final step is to combine the outputs into one data set:

FinalOutput<-rbind( newSubset1,newSubset2,...,newSubset9 )
FinalOutput
Days Subject Predicted_Response
10     308           396.8617
10     309           278.2284
10     310           292.9694
...
11     372           383.3525
...
18     371           434.8685
18     372           454.5697

All of the steps above are done manually. How can I generalize them to get the final output in R? Perhaps something like:

for(i in length(tt)){
Subset[i]<-sleepstudy[sleepstudy$Days2>=tt[i], ] 
fit[i]= lmer(Reaction ~ Days + (1 | Subject), data = Subset[i])
newSubset[i] <- data.frame(  Days = tt[i], Subject = unique(Subset[i]$Subject) )
newSubset[i]$Predicted_Response <- predict(fit[i], newdata = newSubset[i])
...
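
One way the loop idea above might be made to run is sketched below. Two things differ from the attempt: `for (i in length(tt))` visits only the last index, so `seq_along(tt)` is used instead, and the per-iteration results are stored in a list with `[[i]]`, because a data frame or a fitted model cannot be assigned into a single vector element with `[i]`. The names `sleep`, `new_subsets`, `subset_i`, `fit_i`, `new_i`, and `final_output` are illustrative assumptions.

```r
library(lme4)

# Recreate the question's data on a copy.
sleep <- lme4::sleepstudy
sleep$Days2 <- rep(10:18, each = 20)
tt <- sort(unique(sleep$Days2))

# Pre-allocate a list; each element will hold one data frame of predictions.
new_subsets <- vector("list", length(tt))

for (i in seq_along(tt)) {
  # Rows with Days2 at or above the i-th cutoff.
  subset_i <- sleep[sleep$Days2 >= tt[i], ]

  # Random-intercept model fitted to this subset only.
  fit_i <- lmer(Reaction ~ Days + (1 | Subject), data = subset_i)

  # Predict for each subject in the subset at Days = tt[i].
  new_i <- data.frame(Days = tt[i], Subject = unique(subset_i$Subject))
  new_i$Predicted_Response <- predict(fit_i, newdata = new_i)

  new_subsets[[i]] <- new_i
}

# Stack the nine data frames into one.
final_output <- do.call(rbind, new_subsets)
```

The later subsets contain only a few subjects, so `lmer()` may print convergence or singular-fit messages for them; those are messages rather than errors, and the loop still completes.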

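Use purrr::map_dfr() to (a) define a function that handles a single subset, (b) apply that function to each subset, and (c) combine all the new subsets into a single data frame.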

library("lme4")

# Add Days2 to a copy of the sleepstudy data (requires dplyr)
ds_sample <-
  lme4::sleepstudy |>
  dplyr::mutate(
    Days2 = rep(10:18, rep(20, 9))
  )

# Fit the model on rows with Days2 >= .days_2, then predict at Days = .days_2
predict_reaction <- function(.days_2) {
  ds_subset <- ds_sample[.days_2 <= ds_sample$Days2, ]
  fit <- lmer(Reaction ~ Days + (1 | Subject), data = ds_subset)
  newSubset1 <- data.frame(Days = .days_2, Subject = unique(ds_subset$Subject))
  newSubset1$Predicted_Response <- predict(fit, newdata = newSubset1)
  newSubset1
}

# Apply the function to each unique Days2 value and row-bind the results
sort(unique(ds_sample$Days2)) |>
  purrr::map_dfr(predict_reaction)

Output:

Days Subject Predicted_Response
1    10     308           396.8617
2    10     309           278.2284
3    10     310           292.9694
4    10     330           360.4844
5    10     331           366.2942
6    10     332           364.2992
7    10     333           372.5785
8    10     334           353.0810
9    10     335           310.7958
...
87   17     371           453.4433
88   17     372           468.7271
89   18     371           434.8685
90   18     372           454.5697
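
As of purrr 1.0.0, `map_dfr()` is marked as superseded in favour of `map()` followed by `list_rbind()`. A sketch of the same approach with that pair is shown here, assuming purrr 1.0.0 or later and R 4.1 or later for the native pipe; `predict_from`, `cutoff`, and `result` are illustrative names, not part of the answer above.

```r
library(lme4)

# Recreate the question's data on a copy.
ds_sample <- lme4::sleepstudy
ds_sample$Days2 <- rep(10:18, each = 20)

# Fit on rows with Days2 >= cutoff, then predict at Days = cutoff
# for every subject that remains in the subset.
predict_from <- function(cutoff) {
  ds_subset <- ds_sample[ds_sample$Days2 >= cutoff, ]
  fit <- lmer(Reaction ~ Days + (1 | Subject), data = ds_subset)
  new_data <- data.frame(Days = cutoff, Subject = unique(ds_subset$Subject))
  new_data$Predicted_Response <- predict(fit, newdata = new_data)
  new_data
}

# map() returns a list of data frames; list_rbind() stacks them.
result <- sort(unique(ds_sample$Days2)) |>
  purrr::map(predict_from) |>
  purrr::list_rbind()
```

The result should have the same 90 rows as the output above: one row per subject remaining in each subset.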
