How to modify this dplyr code to run multiple linear regressions over all combinations of variables in R



I have the following data:

ind1 <- rnorm(99)
ind2 <- rnorm(99)
ind3 <- rnorm(99)
ind4 <- rnorm(99)
ind5 <- rnorm(99)
dep <- rnorm(99, mean=ind1)
group <- rep(c("A", "B", "C"), each=33)
df <- data.frame(dep,group, ind1, ind2, ind3, ind4, ind5)

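The following code computes, for each group, a multiple linear regression of the dependent variable on two independent variables, which is exactly what I want. However, I would like to run it in one go for dep against every pair of independent variables. How can I incorporate the other models into this code?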

library(dplyr)   # %>%, mutate(), select()
library(tidyr)   # nest(), unnest()
library(purrr)   # map()
library(broom)   # glance(), tidy()

df %>% 
  nest(-group) %>% 
  mutate(fit = map(data, ~ lm(dep ~ ind1 + ind2, data = .)),
         results1 = map(fit, glance),
         results2 = map(fit, tidy)) %>% 
  unnest(results1) %>% 
  unnest(results2) %>% 
  select(group, term, estimate, r.squared, p.value, AIC) %>% 
  mutate(estimate = exp(estimate)) 

Thanks in advance!

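Not a complete answer. Consider building all possible combinations with lapply and combn, turning them into linear formulas with rapply, and then passing each formula into your tidy approach: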

# every combination of 1 to 5 independent variables, as character vectors
indvar_list <- lapply(1:5, function(x) 
                 combn(paste0("ind", 1:5), x, simplify = FALSE))
# turn each combination into a formula such as dep ~ ind1+ind2
formulas_list <- rapply(indvar_list, function(x)
                   as.formula(paste("dep ~", paste(x, collapse="+"))))
# fit one formula per group and tidy the results
run_model <- function(f) {    
    df %>% 
      nest(-group) %>% 
      mutate(fit = map(data, ~ lm(f, data = .)),
             results1 = map(fit, glance),
             results2 = map(fit, tidy)) %>% 
      unnest(results1) %>% 
      unnest(results2) %>% 
      select(group, term, estimate, r.squared, p.value, AIC) %>% 
      mutate(estimate = exp(estimate))
}
# one tibble of results per formula
tibble_list <- lapply(formulas_list, run_model)
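
Since the question asks only for pairs of independent variables, the approach above can be narrowed to combn(..., 2) and the per-formula results stacked into one table. What follows is a minimal sketch rather than a tested drop-in. It regenerates similar data with an arbitrary seed. The names pair_list, model_stats, coefs and model.p.value are mine, not from the question. It also renames the model-level p.value from glance(), because tidy() returns a p.value column too and recent tidyr versions may refuse to unnest duplicate column names. The exp(estimate) step is left out and can be added back at the end if it is really wanted.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

set.seed(42)  # assumption: seed only so the sketch is reproducible
ind <- as.data.frame(matrix(rnorm(99 * 5), ncol = 5,
                            dimnames = list(NULL, paste0("ind", 1:5))))
df <- data.frame(dep = rnorm(99, mean = ind$ind1),
                 group = rep(c("A", "B", "C"), each = 33),
                 ind)

# all pairs of independent variables: choose(5, 2) = 10 of them
pair_list <- combn(paste0("ind", 1:5), 2, simplify = FALSE)

# one formula per pair, e.g. dep ~ ind1 + ind2, named by its right-hand side
formulas <- lapply(pair_list, reformulate, response = "dep")
names(formulas) <- vapply(pair_list, paste, character(1), collapse = " + ")

# model-level statistics, with p.value renamed to avoid clashing with tidy()
model_summary <- function(fit) {
  glance(fit) %>% select(r.squared, model.p.value = p.value, AIC)
}

# fit one formula within each group and return one row per coefficient
run_model <- function(f) {
  df %>%
    nest(data = -group) %>%
    mutate(fit = map(data, ~ lm(f, data = .x)),
           model_stats = map(fit, model_summary),
           coefs = map(fit, tidy)) %>%
    select(group, model_stats, coefs) %>%
    unnest(model_stats) %>%
    unnest(coefs)
}

# stack all models; the "model" column records which pair was used
results <- bind_rows(lapply(formulas, run_model), .id = "model")
```

Each row of results is one coefficient of one model in one group, so filtering on model or group gives back the same kind of table as the original single-formula pipeline.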
