R: bootstrapped standard errors and p-values for the weighted Mann-Whitney test



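I would like to bootstrap the p-value and standard error from a weighted Mann-Whitney U test.

I can run the test with weighted_mannwhitney(c12hour ~ c161sex + weight, efc), which works fine, but I am not entirely sure how to run a bootstrapped version of the same test to obtain a bootstrapped p-value.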

library(sjstats) # weighted Mann-Whitney
library(tidyverse) # main workflow, which has purrr and forcats (IIRC)
# library(broom) # for tidying model output, but not directly loaded
library(modelr) # for bootstrap

data(efc)
efc$weight <- abs(rnorm(nrow(efc), 1, .3))
# weighted Mann-Whitney-U-test ----
weighted_mannwhitney(c12hour ~ c161sex + weight, efc)

# Bootstrapping
set.seed(1000) # for reproducibility
boot_efc <- efc %>% bootstrap(1000) 

# Throws error!
boot_efc %>% 
  dplyr::mutate(c12hour = map(strap, ~weighted_mannwhitney(c12hour ~ c161sex + weight, data = .)),
                tidy = map(c12hour, broom::tidy)) -> boot_efc_out

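Side note: the package that provides the weighted Mann-Whitney test has its own bootstrap functions, which can be used as shown below to obtain a bootstrapped standard error and a bootstrapped p-value. However, that example runs a different function (mean()), and I could not adapt it to the weighted Mann-Whitney test. Not sure whether this helps.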

# or as tidyverse-approach
if (require("dplyr") && require("purrr")) {
  bs <- efc %>%
    bootstrap(100) %>%
    mutate(
      c12hour = map_dbl(strap, ~mean(as.data.frame(.x)$c12hour, na.rm = TRUE))
    )
  # bootstrapped standard error
  boot_se(bs, c12hour)
  # bootstrapped p-value
  boot_p(bs, c12hour)
}

This should do it:

library(sjstats) # weighted Mann-Whitney
library(tidyverse) # main workflow, which has purrr and forcats (IIRC)
# library(broom) # for tidying model output, but not directly loaded
library(modelr) # for bootstrap
#> 
#> Attaching package: 'modelr'
#> The following objects are masked from 'package:sjstats':
#> 
#>     bootstrap, mse, rmse

data(efc)
efc$weight <- abs(rnorm(nrow(efc), 1, .3))
# weighted Mann-Whitney-U-test ----
mw_full <- weighted_mannwhitney(c12hour ~ c161sex + weight, efc)

# Bootstrapping
set.seed(1000) # for reproducibility
boot_efc <- efc %>% bootstrap(1000) 
tmp <- boot_efc %>% 
  dplyr::mutate(c12hour = map_dbl(strap, 
    ~sjstats:::weighted_mannwhitney.formula(c12hour ~ c161sex + weight, 
                                            data = .$data[.$idx, ])$estimate))
boot_p(tmp$c12hour)
#   term   p.value
# 1    x 0.0316659

Created on 2022-09-07 by the reprex package (v2.0.1)

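Note that one of the problems comes from how weighted_mannwhitney() works inside the map() function. When it is called there, it dispatches to the default method rather than the formula method, and that produces an error. You can call the formula method for the statistic directly, sjstats:::weighted_mannwhitney.formula(), as I did in the code. The other problem is that each strap element of boot_efc is not a single data frame. It has two elements, data and idx, where idx holds the row indices of the bootstrapped observations. So you need to use .$data[.$idx, ] as the bootstrap data. The code above applies both fixes to the example you posted.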
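To make the point about the strap elements concrete, here is a minimal sketch that uses only modelr. The data frame df and its columns are made up for illustration and are not the efc data. It inspects one resample object and shows that indexing data by idx by hand gives the same rows as modelr's as.data.frame() method for resample objects, which is what the mean() example in the question relies on.

```r
library(modelr)

set.seed(1)
# Small stand-in data frame (hypothetical, not the efc data)
df <- data.frame(id = 1:5, y = c(2.1, 3.4, 1.8, 4.0, 2.7))

# Three bootstrap replicates; each element of `strap` is a resample object
boot <- bootstrap(df, 3)
s <- boot$strap[[1]]

class(s)  # resample object, not a data frame
names(s)  # the two components: the original data and the sampled row indices

# Two ways to materialise the bootstrap sample as a data frame
manual     <- s$data[s$idx, ]   # index the original data by hand
via_method <- as.data.frame(s)  # modelr's method for resample objects

# Both contain the same resampled rows
identical(manual$id, via_method$id)
```

So inside map() you could presumably pass as.data.frame(.) instead of .$data[.$idx, ]. I have not run that variant against weighted_mannwhitney(), so treat it as an assumption.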