R - How to use tapply to run a t-test for each level of a factor



My data and code are as follows:

my_vector <- rnorm(150)
my_factor1 <- gl(3,50)
my_factor2 <- gl(2,75)
tapply(my_vector, my_factor1, function(x)
  t.test(my_vector~my_factor2, paired=T))

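I want to run a separate t-test for each level of my_factor1, comparing my_vector between the two levels of my_factor2.

However, my code does not split the t-test by the levels of my_factor1. The result is identical for every level, because the function ignores its argument x and passes the whole of my_vector to every t.test call.

This is the output of my code: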

$`1`
Paired t-test
data:  my_vector by my_factor2
t = 0.2448, df = 74, p-value = 0.8073
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2866512  0.3669667
sample estimates:
mean of the differences 
         0.04015775 

$`2`
Paired t-test
data:  my_vector by my_factor2
t = 0.2448, df = 74, p-value = 0.8073
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2866512  0.3669667
sample estimates:
mean of the differences 
         0.04015775 

$`3`
Paired t-test
data:  my_vector by my_factor2
t = 0.2448, df = 74, p-value = 0.8073
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.2866512  0.3669667
sample estimates:
mean of the differences 
         0.04015775 

What am I missing or doing wrong?

Your example is a bit problematic, because if you set up:

df <- data.frame(my_vector = rnorm(150),
                 my_factor1 = gl(3,50),
                 my_factor2 = gl(2,75)
                )

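then, because of how the repeated blocks overlap, the rows with my_factor1 equal to 1 or 3 contain only one unique value of my_factor2 (all 1 for level 1, all 2 for level 3), so there is nothing to compare at those levels. See ?gl. Doing this instead: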

df <- data.frame(my_vector  = rnorm(150),
                 my_factor1 = gl(3, 1, 150),
                 my_factor2 = gl(2, 1, 150))

# by() splits the whole data frame by my_factor1 and hands each piece to the function as x
with(df,
     by(df, my_factor1,
        function(x) t.test(my_vector ~ my_factor2, data = x)))

seems to produce the output you want.
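The difference between the two gl() layouts can be checked with a cross-tabulation. This is only a quick sketch; the object names counts_blocked and counts_interleaved are made up for illustration and are not part of the original code.

```r
# Original layout: long consecutive blocks, so the factors are not fully crossed
counts_blocked <- table(my_factor1 = gl(3, 50),
                        my_factor2 = gl(2, 75))
print(counts_blocked)

# Interleaved layout: levels cycle row by row, so every combination occurs
counts_interleaved <- table(my_factor1 = gl(3, 1, 150),
                            my_factor2 = gl(2, 1, 150))
print(counts_interleaved)
```

In the blocked layout the cells for (my_factor1 = 1, my_factor2 = 2) and (my_factor1 = 3, my_factor2 = 1) are empty, whereas the interleaved layout has 25 observations in every cell.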

As a side note, consider correcting for multiple comparisons: https://stats.stackexchange.com/questions/16779/when-is-multiple-comparison-correction-necessary
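One possible way to apply such a correction is to collect the p-value from each per-level test and pass them to p.adjust(). The sketch below assumes the interleaved data frame from above. The seed, the names p_raw and p_adj, and the choice of the Holm method are illustrative assumptions, not something the original answer prescribes.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible
df <- data.frame(my_vector  = rnorm(150),
                 my_factor1 = gl(3, 1, 150),
                 my_factor2 = gl(2, 1, 150))

# Split the data frame by my_factor1 and keep only the p-value of each t-test
p_raw <- sapply(split(df, df$my_factor1),
                function(x) t.test(my_vector ~ my_factor2, data = x)$p.value)

# Holm is one common choice; "bonferroni" and "BH" are other built-in methods
p_adj <- p.adjust(p_raw, method = "holm")
print(cbind(raw = p_raw, adjusted = p_adj))
```

Whether a correction is needed, and which method suits the analysis, depends on the question being asked; the linked discussion covers that.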
