r - "group_by" and keep the grouping levels as the names of the nested data frames



This question is related to "Can't use emmeans inside map".

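I am using the following code for several steps of a data analysis. I would like to keep the levels of the grouping factor as the names of the nested data frames and use those names to identify each step of the process, instead of the default enumeration [[1]], [[2]], [[3]], and so on. I do not understand the error I am getting. Please show me how to fix my code.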

library(dplyr)
library(purrr)
library(emmeans)
data("warpbreaks")
wb_emm <-  warpbreaks %>%
group_by(tension) %>% 
setNames(unique(.x$tension)) %>%
nest() %>%
mutate(models=map(data,~glm(breaks~wool,data=.x))) %>%
mutate(jt = map(models, ~emmeans::joint_tests(.x, data = .x$data))) %>%
mutate(means=map(models,~emmeans::emmeans(.x,"wool",data=.x$data))) %>%
mutate(p_cont = map(means, ~emmeans::contrast(.x, "pairwise",infer = c(T,T))))
Error in unique(.x$tension) : object '.x' not found

I initially tried group_by(tension) %>% setNames(unique(tension)) and got Error in unique(tension) : object 'tension' not found. I also tried split(.$tension), but it conflicts with nest().

The tension levels are clearly there.

unique(warpbreaks$tension)
[1] L M H
Levels: L M H

The code runs fine without the setNames(unique(.x$tension)) %>% step.

wb_emm$p_cont
[[1]]
contrast estimate   SE  df asymp.LCL asymp.UCL z.ratio p.value
A - B        16.3 6.87 Inf      2.87      29.8 2.378   0.0174 
Confidence level used: 0.95 
[[2]]
contrast estimate   SE  df asymp.LCL asymp.UCL z.ratio p.value
A - B       -4.78 4.27 Inf     -13.1      3.59 -1.119  0.2630 
Confidence level used: 0.95 
[[3]]
contrast estimate   SE  df asymp.LCL asymp.UCL z.ratio p.value
A - B        5.78 3.79 Inf     -1.66      13.2 1.523   0.1277 
Confidence level used: 0.95 

Thanks.

Update: I tried the second solution provided by Ronak Shah below on diamonds, but the names did not change. The code runs with either ungroup() %>% or ungroup %>%.

diamonds %>%
group_by(cut) %>%
nest() %>% 
ungroup %>%
mutate(models=map(data,~glm(price ~ x + y + z + clarity + color,data=.x)),
jt = map(models, ~emmeans::joint_tests(.x, data = .x$data)),
means=map(models,~emmeans::emmeans(.x,"color",data=.x$data)),
p_cont = map(means, ~emmeans::contrast(.x, "pairwise",infer = c(T,T))),
across(models:p_cont, stats::setNames,  .$cut)) -> diamond_result
> diamond_result$jt
[[1]]
model term df1 df2 F.ratio p.value
x            1 Inf 611.626 <.0001 
y            1 Inf   2.914 0.0878 
z            1 Inf 100.457 <.0001 
clarity      7 Inf 800.852 <.0001 
color        6 Inf 256.796 <.0001 

You need to apply setNames to the result of the map step:

library(tidyverse)
warpbreaks %>%
  group_by(tension) %>%
  nest() %>%
  ungroup %>%
  mutate(models = map(data, ~glm(breaks ~ wool, data = .x)),
         jt = map(models, ~emmeans::joint_tests(.x, data = .x$data)),
         means = map(models, ~emmeans::emmeans(.x, "wool", data = .x$data)),
         p_cont = setNames(map(means,
                               ~emmeans::contrast(.x, "pairwise", infer = c(T, T))), .$tension))
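
To see what setNames does here in isolation, the following is a minimal base R sketch. The toy list and its values are made up for illustration and do not come from the question. setNames() returns its first argument with the given names attached, so the elements can then be reached by level rather than by position.

```r
# Toy list standing in for a list-column of results (values are made up).
res <- list(1.5, 2.5, 3.5)

# A factor with the same levels as warpbreaks$tension.
lev <- factor(c("L", "M", "H"), levels = c("L", "M", "H"))

# setNames() returns `res` with names attached; as.character() makes the
# conversion from factor to names explicit.
named <- setNames(res, as.character(lev))

names(named)  # the three level labels
named$M       # access by level name instead of [[2]]
```

In the pipeline above, the same thing happens to the p_cont list-column, with .$tension supplying the names.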

If you want to name all of the list outputs, use across:

warpbreaks %>%
  group_by(tension) %>%
  nest() %>%
  ungroup %>%
  mutate(models = map(data, ~glm(breaks ~ wool, data = .x)),
         jt = map(models, ~emmeans::joint_tests(.x, data = .x$data)),
         means = map(models, ~emmeans::emmeans(.x, "wool", data = .x$data)),
         p_cont = map(means, ~emmeans::contrast(.x, "pairwise", infer = c(T, T))),
         across(models:p_cont, setNames, .$tension)) -> result
result$jt
#$L
# model term df1 df2 F.ratio p.value
# wool         1 Inf   5.653 0.0174 

#$M
# model term df1 df2 F.ratio p.value
# wool         1 Inf   1.253 0.2630 

#$H
# model term df1 df2 F.ratio p.value
# wool         1 Inf   2.321 0.1277 
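
As a different route from the nest() approach above, and not part of the original answer, base R's split() already returns a list named by the levels of the grouping factor, and lapply() preserves those names. The sketch below is a minimal illustration of that idea using only the glm step. It assumes the emmeans steps could be applied over the list in the same way, which is not shown or tested here.

```r
# warpbreaks ships with base R (package datasets).
data("warpbreaks")

# split() returns a list of data frames named by the factor levels.
by_tension <- split(warpbreaks, warpbreaks$tension)

# lapply() keeps the names of its input, so every model is named by level.
models <- lapply(by_tension, function(d) glm(breaks ~ wool, data = d))

names(models)    # the tension levels
coef(models$L)   # coefficients of the model fitted to the "L" subset
```

The trade-off is that the results live in a plain named list rather than in list-columns of a tibble, so this fits best when the nested data frame itself is not needed downstream.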
