r语言 - 如何使用多重插补数据集(从 MICE 生成)获取类型 3 F 值和边际效应



我使用 MICE 包创建了一个包含 7 个插补数据集的 MI 数据集

imputeddata <- mice(distress_tibmi, m=7)

我的数据结构现在为:

..$ id            : num [1:342] 4 8 10 11 23 32 40 47 48 56 ...
..$ diagnosis     : Factor w/ 2 levels "psychosis","bpd": 1 1 1 1 1 1 1 1 1 1 ...
..$ gender        : Factor w/ 2 levels "female","male": 1 2 2 2 2 1 1 1 1 1 ...
..$ distress.time : Factor w/ 2 levels "baseline","post": 1 1 1 1 1 1 1 1 1 1 ...
..$ distress.score: num [1:342] -2.436 -1.242 0.251 -1.54 0.549 ...
..$ depression    : num [1:342] 0.332 0.542 1.172 -0.298 1.172 ...
..$ anxiety       : num [1:342] -1.898 -0.687 0.87 -0.687 1.043 ...
..$ choice        : num [1:342] 6.73 2.18 2 6.45 3.55 ...
$ imp            :List of 8
..$ id            :'data.frame':  0 obs. of  7 variables:
.. ..$ 1: logi(0) 
.. ..$ 2: logi(0) 
.. ..$ 3: logi(0) 
.. ..$ 4: logi(0) 
.. ..$ 5: logi(0) 
.. ..$ 6: logi(0) 
.. ..$ 7: logi(0) 
..$ diagnosis     :'data.frame':  0 obs. of  7 variables:
.. ..$ 1: logi(0) 
.. ..$ 2: logi(0) 
.. ..$ 3: logi(0) 
.. ..$ 4: logi(0) 
.. ..$ 5: logi(0) 
.. ..$ 6: logi(0) 
.. ..$ 7: logi(0) 
..$ gender        :'data.frame':  0 obs. of  7 variables:
.. ..$ 1: logi(0) 
.. ..$ 2: logi(0) 
.. ..$ 3: logi(0) 
.. ..$ 4: logi(0) 
.. ..$ 5: logi(0) 
.. ..$ 6: logi(0) 
.. ..$ 7: logi(0) 
..$ distress.time :'data.frame':  0 obs. of  7 variables:
.. ..$ 1: logi(0) 
.. ..$ 2: logi(0) 
.. ..$ 3: logi(0) 
.. ..$ 4: logi(0) 
.. ..$ 5: logi(0) 
.. ..$ 6: logi(0) 
.. ..$ 7: logi(0) 
..$ distress.score:'data.frame':  59 obs. of  7 variables:
.. ..$ 1: num [1:59] -0.6808 -0.6448 -1.658 -0.0293 -0.3463 ...
.. ..$ 2: num [1:59] 1.2736 0.2507 -0.0478 -0.6448 1.2736 ...
.. ..$ 3: num [1:59] -0.681 0.848 -1.658 1.274 0.251 ...
.. ..$ 4: num [1:59] -1.3322 -0.0478 -0.6808 -0.355 -2.4358 ...
.. ..$ 5: num [1:59] -1.3322 -0.355 -4.8239 -0.6448 -0.0293 ...
.. ..$ 6: num [1:59] -1.3322 0.5493 -0.0293 -2.6352 0.8478 ...
.. ..$ 7: num [1:59] 0.5493 0.2507 1.1463 -0.0478 1.2736 ...
..$ depression    :'data.frame':  24 obs. of  7 variables:
.. ..$ 1: num [1:24] -0.0882 -0.5084 -1.2966 0.542 -2.1891 ...
.. ..$ 2: num [1:24] 0.332 0.255 1.592 0.752 0.945 ...
.. ..$ 3: num [1:24] -2.159 0.332 -0.262 0.962 1.382 ...
.. ..$ 4: num [1:24] -0.2621 -0.0897 -1.7689 1.1172 0.7724 ...
.. ..$ 5: num [1:24] 0.122 -2.159 -2.399 1.462 -2.189 ...
.. ..$ 6: num [1:24] -0.298 -0.434 -0.607 1.172 0.962 ...
.. ..$ 7: num [1:24] 0.6 1.29 1.635 0.542 0.428 ...
..$ anxiety       :'data.frame':  10 obs. of  7 variables:
.. ..$ 1: num [1:10] 0.909 -1.379 1.389 -1.268 -0.598 ...
.. ..$ 2: num [1:10] 1.0433 -1.3789 -0.0955 -0.7655 -0.598 ...
.. ..$ 3: num [1:10] 1.0771 -1.8979 -0.0955 -0.5138 0.0052 ...
.. ..$ 4: num [1:10] -0.598 -1.603 0.9095 -2.608 -0.0955 ...
.. ..$ 5: num [1:10] 0.742 0.2395 -1.7249 -2.1055 -0.0955 ...
.. ..$ 6: num [1:10] 1.412 -0.86 1.389 -2.608 0.575 ...
.. ..$ 7: num [1:10] 1.245 -1.033 0.909 0.909 -1.033 ...
..$ choice        :'data.frame':  22 obs. of  7 variables:
.. ..$ 1: num [1:22] 4.55 3.91 7.09 4.27 3.55 ...
.. ..$ 2: num [1:22] 8.09 5.09 5.36 4.91 4.45 ...
.. ..$ 3: num [1:22] 4.27 7.09 3.91 3.91 7.09 ...
.. ..$ 4: num [1:22] 5.82 6.27 7 6.82 4.73 ...
.. ..$ 5: num [1:22] 6.18 5.36 5.36 3.18 3.18 ...
.. ..$ 6: num [1:22] 6.18 6.73 4.73 4.73 5 ...
.. ..$ 7: num [1:22] 5.45 7.09 7.45 3.18 4.91 ...
$ m              : num 7
$ where          : logi [1:342, 1:8] FALSE FALSE FALSE FALSE FALSE FALSE ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:342] "1" "2" "3" "4" ...
.. ..$ : chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
$ blocks         :List of 8
..$ id            : chr "id"
..$ diagnosis     : chr "diagnosis"
..$ gender        : chr "gender"
..$ distress.time : chr "distress.time"
..$ distress.score: chr "distress.score"
..$ depression    : chr "depression"
..$ anxiety       : chr "anxiety"
..$ choice        : chr "choice"
..- attr(*, "calltype")= Named chr [1:8] "type" "type" "type" "type" ...
.. ..- attr(*, "names")= chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
$ call           : language mice(data = distress_tibmi, m = 7)
$ nmis           : Named int [1:8] 0 0 0 0 59 24 10 22
..- attr(*, "names")= chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
$ method         : Named chr [1:8] "" "" "" "" ...
..- attr(*, "names")= chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
$ predictorMatrix: num [1:8, 1:8] 0 1 1 1 1 1 1 1 1 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
.. ..$ : chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
$ visitSequence  : chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
$ formulas       :List of 8
..$ id            :Class 'formula'  language id ~ 0 + diagnosis + gender + distress.time +      distress.score + depression +  ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00> 
..$ diagnosis     :Class 'formula'  language diagnosis ~ 0 + id + gender + distress.time +      distress.score + depression +  ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00> 
..$ gender        :Class 'formula'  language gender ~ 0 + id + diagnosis + distress.time +      distress.score + depression +  ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00> 
..$ distress.time :Class 'formula'  language distress.time ~ 0 + id + diagnosis +      gender + distress.score + depression +  ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00> 
..$ distress.score:Class 'formula'  language distress.score ~ 0 + id + diagnosis +      gender + distress.time + depression +  ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00> 
..$ depression    :Class 'formula'  language depression ~ 0 + id + diagnosis +      gender + distress.time + distress.score +  ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00> 
..$ anxiety       :Class 'formula'  language anxiety ~ 0 + id + diagnosis + gender +      distress.time + distress.score +  ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00> 
..$ choice        :Class 'formula'  language choice ~ 0 + id + diagnosis + gender +      distress.time + distress.score +  ...
.. .. ..- attr(*, ".Environment")=<environment: 0x7ff907cd9d00> 
$ post           : Named chr [1:8] "" "" "" "" ...
..- attr(*, "names")= chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
$ blots          :List of 8
..$ id            : list()
..$ diagnosis     : list()
..$ gender        : list()
..$ distress.time : list()
..$ distress.score: list()
..$ depression    : list()
..$ anxiety       : list()
..$ choice        : list()
$ seed           : logi NA
$ iteration      : num 5
$ lastSeedValue  : int [1:626] 10403 331 -1243825859 461242975 2057104913 -837414599 -54045022 1529270132 -105270003 -1459771035 ...
$ chainMean      : num [1:8, 1:5, 1:7] NaN NaN NaN NaN -0.727 ...
..- attr(*, "dimnames")=List of 3
.. ..$ : chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
.. ..$ : chr [1:5] "1" "2" "3" "4" ...
.. ..$ : chr [1:7] "Chain 1" "Chain 2" "Chain 3" "Chain 4" ...
$ chainVar       : num [1:8, 1:5, 1:7] NA NA NA NA 2.26 ...
..- attr(*, "dimnames")=List of 3
.. ..$ : chr [1:8] "id" "diagnosis" "gender" "distress.time" ...
.. ..$ : chr [1:5] "1" "2" "3" "4" ...
.. ..$ : chr [1:7] "Chain 1" "Chain 2" "Chain 3" "Chain 4" ...
$ loggedEvents   : NULL
$ version        :Classes 'package_version', 'numeric_version'  hidden list of 1
..$ : int [1:3] 3 9 0
$ date           : Date[1:1], format:  ...
- attr(*, "class")= chr "mids"
Show in New WindowClear OutputExpand/Collapse Output
id             diagnosis      gender   
Min.   :  1.00   psychosis:250   female:196  
1st Qu.: 76.75   bpd      : 92   male  :146  
Median :198.00                               
Mean   :215.66                               
3rd Qu.:337.00                               
Max.   :514.00                               
distress.time distress.score      depression      
baseline:171   Min.   :-4.8239   Min.   :-2.39920  
post    :171   1st Qu.:-0.6808   1st Qu.:-0.76410  
Median :-0.0293   Median : 0.08280  
Mean   :-0.3083   Mean   :-0.06085  
3rd Qu.: 0.6221   3rd Qu.: 0.77240  
Max.   : 1.2736   Max.   : 1.80690  
NA's   :59        NA's   :24        
anxiety            choice      
Min.   :-2.6080   Min.   :0.0909  
1st Qu.:-0.9330   1st Qu.:2.4545  
Median :-0.0955   Median :4.0454  
Mean   :-0.1397   Mean   :3.8903  
3rd Qu.: 0.8702   3rd Qu.:5.1136  
Max.   : 1.7471   Max.   :8.0909  
NA's   :10        NA's   :22      
Show in New WindowClear OutputExpand/Collapse Output

1
<dbl>
2
<dbl>
3
<dbl>
4
<dbl>
5
<dbl>
6
<dbl>
7
<dbl>
21  -0.6808 1.2736  -0.6808 -1.3322 -1.3322 -1.3322 0.5493
34  -0.6448 0.2507  0.8478  -0.0478 -0.3550 0.5493  0.2507
48  -1.6580 -0.0478 -1.6580 -0.6808 -4.8239 -0.0293 1.1463
141 -0.0293 -0.6448 1.2736  -0.3550 -0.6448 -2.6352 -0.0478
143 -0.3463 1.2736  0.2507  -2.4358 -0.0293 0.8478  1.2736
180 1.1463  -1.0065 -2.3094 -3.6124 -0.6448 -1.5403 -1.0065
181 -0.0293 -0.6808 -0.6808 -3.9381 -0.3463 -1.3322 0.2964
182 1.2736  -0.3463 0.9479  -0.0478 0.9479  -0.3463 1.1463
197 -0.3550 -0.0293 -0.6808 -0.3550 -1.3322 -4.8239 -0.6448
208 0.6221  0.2507  -0.6808 -0.3550 -0.6448 0.6221  -0.6448
1-10 of 59 rows 

我用插补数据集创建了一个 lm 并使用 pool(( 对其进行了总结

distressmodel <- with(data = imputeddata, exp = lm(distress.score ~ distress.time * diagnosis))
summary(mice::pool(distressmodel), conf.int = TRUE, conf.level = 0.95 )

但是现在我想获取模型的 3 F 类型值,但此代码不起作用

car::Anova(mice::pool(distressmodel), type = 3)

它会产生以下错误消息:

UseMethod("vcov"( 中的错误:没有适用于类 "c('mipo', 'data.frame'("的对象 'vcov' 的方法

我还想获得模型的边际效应(例如,仅查看分组变量的一个级别的效应,即诊断(,我已经在我的完整案例分析中成功完成了,但是这段代码:

summary(margins(distressmodel, data = subset(imputeddata, diagnosis == "bpd", type = "response")))

产生此错误

subset_datlist中的错误(datlist = x,子集 = 子集,选择 = 选择,:找不到对象"诊断">

有没有人对代码的更改或让 car::anova 或边距 (( 包使用 MI 数据集的方法有任何建议?(最好能够合并结果

with(data, exp)程序可用于将统计检验/模型应用于多个插补输出(mipo(,前提是它们允许用coef方法提取估计值,并使用vcov提取方差-协方差矩阵。后者似乎不适用于您使用的函数car::Anova

幸运的是,有miceadds包,它提供了进行和汇集其他统计测试的程序。miceadds::mi.anova似乎完全符合您的要求:

miceadds::mi.anova(imputeddata, distress.score ~ distress.time * diagnosis, type=3)

然而,我不确定边际影响。通常,您可以进行更多的编码,并将任何统计过程分别应用于每个插补样本。然后,您可以使用pool.scalar函数将其池化。此方法还为您提供合并统计量的插补内方差、插补间方差和总方差估计值。(如果需要,您可以对 0 的差异进行基本的 t 检验。

这种方法依赖于统计数据的正态分布,或者依赖于它们可以转换为正态分布的指标。(Stef van Buuren 给出了一个统计数据列表,这些统计数据可以很容易地转换、池化和反向转换,见表 5.2(所以你想要的边际手段应该是可能的,对吧?

我不知道您使用的margins函数(它来自什么包?但是,如果您想获得边际手段并自己汇集它们,则此方法如下:

# transform your mids into a long-format data frame
imputed_l <- mice::complete(imputeddata, action="long")
nimp <- imputed_l$m #number of imputations for convenience
# create vectors to contain the marginal effects and their SEs from all seven imputations
mm_all <- vector("numeric", nimp)
mmse_all <- mm_all
# get marginal means and SEs for all imputations
for (i in 1:nimp) {
mm_all[i] <- Expression_producing_marginal_mean(..., data = subset(imputed_l, .imp=i) )
mmse_all[i] <- Expression_producing_SE(..., data = subset(imputed_l, .imp=i) )
}
# pool them (the U argument should be variances, so square the SEs)
mm_pool <- pool.scalar(Q=mm_all, U=mmse_all^2, n=nrow(imputed_l)/nimp)
mm_pool$qbar #marginal mean aggregated across imputations
sqrt(mm_pool$t) #SE of marginal mean (based on within- and between-imputations variance)

相关内容

  • 没有找到相关文章

最新更新