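I want to bootstrap the p-value and standard error from a weighted Mann-Whitney U test.
I can run the test with weighted_mannwhitney(c12hour ~ c161sex + weight, efc), which works fine, but I am not sure how to run a bootstrapped version of the same test to get a bootstrapped p-value.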
library(sjstats) # weighted Mann-Whitney
library(tidyverse) # main workflow, which has purrr and forcats (IIRC)
# library(broom) # for tidying model output, but not directly loaded
library(modelr) # for bootstrap
data(efc)
efc$weight <- abs(rnorm(nrow(efc), 1, .3))
# weighted Mann-Whitney-U-test ----
weighted_mannwhitney(c12hour ~ c161sex + weight, efc)
# Bootstrapping
set.seed(1000) # for reproducibility
boot_efc <- efc %>% bootstrap(1000)
# Throws error!
boot_efc %>%
  dplyr::mutate(c12hour = map(strap, ~weighted_mannwhitney(c12hour ~ c161sex + weight, data = .)),
                tidy = map(c12hour, broom::tidy)) -> boot_efc_out
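Side note: the package that provides the weighted Mann-Whitney test has its own bootstrap functions, which can be used as shown below to get a bootstrapped standard error and a bootstrapped p-value. However, that example bootstraps a different statistic (mean()), and I could not adapt it to the weighted Mann-Whitney test. Not sure whether this helps.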
# or as tidyverse-approach
if (require("dplyr") && require("purrr")) {
  bs <- efc %>%
    bootstrap(100) %>%
    mutate(
      c12hour = map_dbl(strap, ~mean(as.data.frame(.x)$c12hour, na.rm = TRUE))
    )
  # bootstrapped standard error
  boot_se(bs, c12hour)
  # bootstrapped p-value
  boot_p(bs, c12hour)
}
This should do it:
library(sjstats) # weighted Mann-Whitney
library(tidyverse) # main workflow, which has purrr and forcats (IIRC)
# library(broom) # for tidying model output, but not directly loaded
library(modelr) # for bootstrap
#>
#> Attaching package: 'modelr'
#> The following objects are masked from 'package:sjstats':
#>
#> bootstrap, mse, rmse
data(efc)
efc$weight <- abs(rnorm(nrow(efc), 1, .3))
# weighted Mann-Whitney-U-test ----
mw_full <- weighted_mannwhitney(c12hour ~ c161sex + weight, efc)
# Bootstrapping
set.seed(1000) # for reproducibility
boot_efc <- efc %>% bootstrap(1000)
tmp <- boot_efc %>%
  dplyr::mutate(c12hour = map_dbl(strap,
    ~sjstats:::weighted_mannwhitney.formula(c12hour ~ c161sex + weight,
                                            data = .$data[.$idx, ])$estimate))
boot_p(tmp$c12hour)
#   term   p.value
# 1    x 0.0316659
Created on 2022-09-07 by the reprex package (v2.0.1)
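The question also asks for a standard error. As a rough base-R sketch of what can be computed from a vector of bootstrap replicates such as tmp$c12hour: the standard error is commonly taken as the standard deviation of the replicates, and a t-type p-value can be formed from their mean divided by that standard error. The vector boot_est below is a simulated, hypothetical stand-in for the real replicates, and whether this matches what boot_se() and boot_p() do internally is an assumption that should be checked against the sjstats documentation.

```r
# Hypothetical stand-in for tmp$c12hour: 1000 simulated bootstrap estimates
set.seed(42)
boot_est <- rnorm(1000, mean = 0.1, sd = 0.05)

# Bootstrapped standard error: standard deviation of the replicates
se <- sd(boot_est)

# Percentile confidence interval taken directly from the replicates
ci <- quantile(boot_est, probs = c(0.025, 0.975))

# t-type p-value: mean of the replicates relative to their spread
# (assumption: a two-sided test against zero with n - 1 degrees of freedom)
t_stat <- mean(boot_est) / se
p_val <- 2 * pt(abs(t_stat), df = length(boot_est) - 1, lower.tail = FALSE)

print(se)
print(ci)
print(p_val)
```

With the real results in scope, boot_est would simply be replaced by tmp$c12hour.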
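Note that one of the problems comes from the way weighted_mannwhitney() works inside map(). When it is called there, it dispatches to the default method rather than the formula method, which then raises an error. You can call the formula method directly, as I did in the code. The other problem is that each strap element of boot_efc is not a single data frame; it has two elements, data and idx, where idx holds the row indices of the bootstrapped observations. So you need to use .$data[.$idx, ] as the bootstrap data. The code above applies this to the example you posted.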
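To make the structure of a strap element concrete, here is a small sketch on a toy data frame (toy and its columns are made up for illustration). It inspects one resample object and compares the manual data[idx, ] subsetting with as.data.frame(), which the question's own side-note example already uses on a strap element.

```r
library(modelr)

set.seed(1)
# Hypothetical toy data, only for illustration
toy <- data.frame(id = 1:5, value = c(2.1, 3.4, 1.8, 4.0, 2.9))

boot_toy <- bootstrap(toy, 3)
s <- boot_toy$strap[[1]]

# A strap element is a "resample" object, not a data frame
print(class(s))
print(names(s))

# Manual subsetting: original data indexed by the resampled row numbers
manual <- s$data[s$idx, ]

# Equivalent conversion through the as.data.frame() method for resamples
via_method <- as.data.frame(s)

print(all.equal(manual, via_method))
```

Either form can be passed as the data argument inside the map_dbl() call; as.data.frame(.) may read a little more clearly than .$data[.$idx, ].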