我的数据看起来像这样
q3a q3b q3c q3d q3e ... q10d grp
1 2 3 4 5 ... 1 1
2 1 2 3 4 2 1
3 2 1 5 2 ... 1 2
2 1 2 1 1 ... 2 2
2 3 4 1 2 ... 3 3
我想对每个问题运行单因素方差和邓肯事后测试。对于 q3a,代码将是
library("DescTools")
q3a <- aov(q3a ~ grp,data = pgy)
PostHocTest(q3a,method = "duncan")
如何编写 foreach
循环来迭代数据中每个变量的相同模型?
## List of variables:
> dput(names(duncan))
c("q3a", "q3b", "q3c", "q3d", "q3e", "q4a", "q4b", "q4d", "q4e",
"q4f", "q4g", "q4h", "q4i", "q4j", "q4k", "q4l", "q4m", "q5b",
"q5c", "q5d", "q6b", "q6c", "q6f", "q7b", "q7d", "q7f", "q7g",
"q7h", "q7i", "q8a", "q8b", "q9a", "q9b", "q9c", "q9d", "q10a",
"q10b", "q10c", "q10d")
谢谢!
这不是使用foreach
的方法,而是使用dplyr
包和purrr
包的方法。此方法将允许您扩展已完成的操作以进行进一步分析。
# first load the packages
library(dplyr)
library(purrr)
# read in data
pgy <- read.table(text = "
1 2 3 4 5 1 1
2 1 2 3 4 2 1
3 2 1 5 2 1 2
2 1 2 1 1 2 2
2 3 4 1 2 3 3",
col.names = c("q3a", "q3b", "q3c",
"q3d", "q3e", "q10d", "grp"))
# data manipulation to get dataframe in required format
results <- pgy %>%
# use this mutate call to convert grp to factor
# you may not need to do this if that column is already a factor
mutate(grp = factor(grp)) %>%
# this gather function will convert the table from wide to long
# this is necessary to do the group_by() %>% nest() combo
# the q3a:q10d should work for you if that is the range of questions
# if not then you can change it to the first and last column names
# containing question data - these should be unquoted
gather(key = "question", value = "result", q3a:q10d) %>%
# this creates groupings based on which question
group_by(question) %>%
# this creates nested dataframes for each group - known as list-columns
# the column name of the nested dataframes will be data unless
# explicitly specified
nest() %>%
# this is where it gets a little tricky
# you can use a mutate call to create a new column, but since your
# data for each column is now stored in a nested list, you need to use
# map from the purrr package to operate on it.
# the ~ symbol is a short cut to allow you to avoid writing an anonymous function
# the .x in the aov function represents the data object in the map call
# notice that data is also the name of the listed column
mutate(aov_test = map(data, ~ aov(result ~ grp, data = .x)),
# this is the real magic of this method, you can use the variable created
# in the first line on the mutate function as the input to the
# second line for your PostHocTest input
duncan = map(aov_test, ~DescTools::PostHocTest(x = .x, method = "duncan")))
然后,您可以像这样访问邓肯测试的结果:
results$duncan[[1]]
或查看所有内容:
map(1:nrow(results), ~ results$duncan[[.x]])
对于 aov 测试,您可以使用 broom
包在整洁的数据框中获取结果,如下所示:
results %>%
unnest(map(aov_test, broom::tidy), .drop = TRUE)
您可以查看其他主要broom
函数(augment()
和glance()
(以查看有关模型的其他信息。不幸的是,您正在做的邓肯测试似乎没有整洁。
这是我使用 lapply
的问题的简洁解决方案:
## list of names of my data
> dput(names(duncan))
c("grp", "q3a", "q3b", "q3c", "q3d", "q3e", "q4a", "q4b", "q4d",
"q4e", "q4f", "q4g", "q4h", "q4i", "q4j", "q4k", "q4l", "q4m",
"q5b", "q5c", "q5d", "q6b", "q6c", "q6f", "q7b", "q7d", "q7f",
"q7g", "q7h", "q7i", "q8a", "q8b", "q9a", "q9b", "q9c", "q9d",
"q10a", "q10b", "q10c", "q10d")
## use lapply to run model for each variable
results <- lapply(duncan[,-1], function(x) PostHocTest(aov(x ~ pgy, data = duncan), method = "duncan"))
## extract results for each variable
results$q3a
Posthoc multiple comparisons of means : Duncan's new multiple range test
95% family-wise confidence level
$pgy
diff lwr.ci upr.ci pval
PGY2-PGY1 0.10716048 0.04104755 0.17327342 0.0012 **
PGY3-PGY1 0.05197694 -0.01485439 0.11880828 0.1274
PGY3-PGY2 -0.05518354 -0.12593368 0.01556661 0.1263
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
results$q3b
etc..
我在这里找到了类似的解决方案:每个自变量单独针对因变量的线性回归循环