Using a for loop to run ANOVA tests on every data frame in a list in R



I have this data frame:

df <- read.table(text = "id G1  G2  G3  value
1   A   D20 TAN 1
2   A   D20 TAN 9
3   A   D20 TAN 10
4   A   D40 TAN 8
5   A   D40 TAN 3
6   A   D40 TAN 9
7   A   D60 TAN 5
8   A   D60 TAN 5
9   A   D60 TAN 10
10  B   D20 TAN 7
11  B   D20 TAN 8
12  B   D20 TAN 10
13  B   D40 TAN 8
14  B   D40 TAN 3
15  B   D40 TAN 7
16  B   D60 TAN 1
17  B   D60 TAN 10
18  B   D60 TAN 1
19  C   D20 TAN 5
20  C   D20 TAN 9
21  C   D20 TAN 4
22  C   D40 TAN 6
23  C   D40 TAN 3
24  C   D40 TAN 8
25  C   D60 TAN 9
26  C   D60 TAN 10
27  C   D60 TAN 4
28  A   D20 BBC 9
29  A   D20 BBC 3
30  A   D20 BBC 7
31  A   D40 BBC 10
32  A   D40 BBC 7
33  A   D40 BBC 4
34  A   D60 BBC 2
35  A   D60 BBC 3
36  A   D60 BBC 8
37  B   D20 BBC 8
38  B   D20 BBC 1
39  B   D20 BBC 5
40  B   D40 BBC 6
41  B   D40 BBC 2
42  B   D40 BBC 6
43  B   D60 BBC 9
44  B   D60 BBC 2
45  B   D60 BBC 10
46  C   D20 BBC 3
47  C   D20 BBC 1
48  C   D20 BBC 4
49  C   D40 BBC 10
50  C   D40 BBC 8
51  C   D40 BBC 3
52  C   D60 BBC 5
53  C   D60 BBC 3
54  C   D60 BBC 1",stringsAsFactors = FALSE, header = TRUE)

I create an additional column like this:

df$Group<-paste(df$G2,df$G3)

Then I split df by Group into a list:

L1<-split(df,df$Group)

Now I want to run an ANOVA and a Tukey test on each table in L1. For example:

a1<-aov(L1$`D20 BBC`$value~L1$`D20 BBC`$G1)
TukeyHSD(a1)

But that only handles a single table. How can I use a for loop to run aov on every table in L1, and then run TukeyHSD on every aov result?

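You can do this with lapply, which applies a function to each element of the list and returns the results as a list with the same names: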

lapply(L1, function(x) with(x, TukeyHSD(aov(value ~ G1))))
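Since the question asks specifically about a for loop, here is a minimal sketch of the loop equivalent of the lapply call. To keep it self-contained it rebuilds only the two BBC groups D20 and D40 from the data above; `df_small`, `aov_results` and `tukey_results` are illustrative names that do not appear in the original post.

```r
# Subset of the question's data (groups "D20 BBC" and "D40 BBC" only).
df_small <- data.frame(
  G1    = factor(rep(rep(c("A", "B", "C"), each = 3), times = 2)),
  G2    = rep(c("D20", "D40"), each = 9),
  G3    = "BBC",
  value = c(9, 3, 7, 8, 1, 5, 3, 1, 4,
            10, 7, 4, 6, 2, 6, 10, 8, 3)
)
df_small$Group <- paste(df_small$G2, df_small$G3)
L1 <- split(df_small, df_small$Group)

# Pre-allocate named lists so each result can be stored under its group name.
aov_results   <- vector("list", length(L1))
tukey_results <- vector("list", length(L1))
names(aov_results) <- names(tukey_results) <- names(L1)

for (g in names(L1)) {
  aov_results[[g]]   <- aov(value ~ G1, data = L1[[g]])  # one-way ANOVA per group
  tukey_results[[g]] <- TukeyHSD(aov_results[[g]])       # post-hoc test on that fit
}

tukey_results[["D20 BBC"]]
```

Passing `data = L1[[g]]` with a plain formula keeps the printed term names short (`G1` rather than ``L1$`D20 BBC`$G1``), and keeping the aov fits in their own list means they can still be inspected with `summary()` afterwards.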

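There is in fact a function, by, that applies a function to a data frame split by a grouping variable, so you can skip the explicit split and do the following: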

by(df, df$Group, function(x) with(x, TukeyHSD(aov(value ~ G1))))
# diff        lwr      upr     p adj
# B-A -1.666667  -8.752543 5.419210 0.7604243
# C-A -3.666667 -10.752543 3.419210 0.3205994
# C-B -2.000000  -9.085876 5.085876 0.6792890
# -------------------------------------------------------------------------------- 
#   diff        lwr       upr     p adj
# B-A  1.6666667  -6.725769 10.059102 0.8205065
# C-A -0.6666667  -9.059102  7.725769 0.9679553
# C-B -2.3333333 -10.725769  6.059102 0.6866510
# -------------------------------------------------------------------------------- 
#   diff       lwr      upr     p adj
# B-A -2.333333e+00 -9.895291 5.228624 0.6334637
# C-A  1.776357e-15 -7.561958 7.561958 1.0000000
# C-B  2.333333e+00 -5.228624 9.895291 0.6334637
# -------------------------------------------------------------------------------- 
#   diff       lwr      upr     p adj
# B-A -0.6666667 -7.703163 6.369830 0.9548296
# C-A -1.0000000 -8.036497 6.036497 0.9021379
# C-B -0.3333333 -7.369830 6.703163 0.9884428
# -------------------------------------------------------------------------------- 
#   diff        lwr       upr     p adj
# B-A  2.666667  -5.684119 11.017452 0.6148213
# C-A -1.333333  -9.684119  7.017452 0.8786205
# C-B -4.000000 -12.350785  4.350785 0.3681421
# -------------------------------------------------------------------------------- 
#   diff        lwr       upr     p adj
# B-A -2.666667 -12.441010  7.107677 0.6957155
# C-A  1.000000  -8.774344 10.774344 0.9475956
# C-B  3.666667  -6.107677 13.441010 0.5210071
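The value returned by by can also be indexed by group name, which makes it possible to gather all comparisons into one table. The sketch below assumes the same two-group subset of the question's data; `df_small`, `res` and `tukey_df` are illustrative names, not part of the original answer.

```r
# Subset of the question's data (groups "D20 BBC" and "D40 BBC" only).
df_small <- data.frame(
  G1    = factor(rep(rep(c("A", "B", "C"), each = 3), times = 2)),
  G2    = rep(c("D20", "D40"), each = 9),
  G3    = "BBC",
  value = c(9, 3, 7, 8, 1, 5, 3, 1, 4,
            10, 7, 4, 6, 2, 6, 10, 8, 3)
)
df_small$Group <- paste(df_small$G2, df_small$G3)

res <- by(df_small, df_small$Group,
          function(x) TukeyHSD(aov(value ~ G1, data = x)))

# Each element is a TukeyHSD object; its $G1 component is a numeric matrix.
groups <- dimnames(res)[[1]]

tukey_df <- do.call(rbind, lapply(groups, function(g) {
  m <- res[[g]]$G1
  data.frame(Group = g, comparison = rownames(m), m, check.names = FALSE)
}))
rownames(tukey_df) <- NULL  # drop the de-duplicated row names left by rbind

tukey_df
```

`check.names = FALSE` keeps the `p adj` column name exactly as TukeyHSD produces it; without it R would rename the column to `p.adj`.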

A tidyverse approach could be:

library(dplyr)
library(purrr)

df %>%
 group_split(Group, .keep = FALSE) %>%  # `.keep` replaces the deprecated `keep` in dplyr >= 1.0.0
 map(~ TukeyHSD(aov(value ~ G1, data = .)))
[[1]]
  Tukey multiple comparisons of means
    95% family-wise confidence level
Fit: aov(formula = value ~ G1, data = .)
$G1
         diff        lwr      upr     p adj
B-A -1.666667  -8.752543 5.419210 0.7604243
C-A -3.666667 -10.752543 3.419210 0.3205994
C-B -2.000000  -9.085876 5.085876 0.6792890

Adding tidy() from broom turns each result into a tibble:

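library(broom)  # for tidy(); dplyr and purrr are also needed

df %>%
 group_split(Group, .keep = FALSE) %>%  # `.keep` replaces the deprecated `keep` in dplyr >= 1.0.0
 map(~ TukeyHSD(aov(value ~ G1, data = .))) %>%
 map(tidy)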
[[1]]
# A tibble: 3 x 6
  term  comparison estimate conf.low conf.high adj.p.value
  <chr> <chr>         <dbl>    <dbl>     <dbl>       <dbl>
1 G1    B-A           -1.67    -8.75      5.42       0.760
2 G1    C-A           -3.67   -10.8       3.42       0.321
3 G1    C-B           -2.00    -9.09      5.09       0.679
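One thing to watch: group_split returns an unnamed list, so the output above is labelled `[[1]]`, `[[2]]`, ... rather than by group. A possible variation, sketched here on a two-group subset of the question's data, is to use base split (which keeps the names) and then stack the tidied tibbles with bind_rows. `df_small` and `tukey_tbl` are illustrative names, and the exact columns returned by tidy depend on the installed broom version.

```r
library(dplyr)
library(purrr)
library(broom)

# Subset of the question's data (groups "D20 BBC" and "D40 BBC" only).
df_small <- data.frame(
  G1    = factor(rep(rep(c("A", "B", "C"), each = 3), times = 2)),
  G2    = rep(c("D20", "D40"), each = 9),
  G3    = "BBC",
  value = c(9, 3, 7, 8, 1, 5, 3, 1, 4,
            10, 7, 4, 6, 2, 6, 10, 8, 3)
)
df_small$Group <- paste(df_small$G2, df_small$G3)

tukey_tbl <- split(df_small, df_small$Group) %>%        # named list, one element per group
  map(~ tidy(TukeyHSD(aov(value ~ G1, data = .x)))) %>% # one tidy tibble per group
  bind_rows(.id = "Group")                              # list names become a Group column

tukey_tbl
```

The result is a single tibble with three comparison rows per group, which is usually easier to filter or plot than a list of separate tables.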
