Recoding a variable in a mids object from the mice library in R



The following dataset is available:

data <- structure(list(asdas_6month = c(23.1222666868239, 25.4056847196073, 
25.9886630231065, NA, 26.9450864282904, 15.1832953552198, 22.1618055512694, 
NA, 24.1387146612986, 25.598233740795, 22.6844495409994, 25.0138310842063, 
20.9944595011522, 17.0762423377328, NA, NA, 20.2359010676347, 
17.5468970969989, 22.9765676870538, 26.3032333127368, NA, NA, 
NA, 17.3203951667699, 19.126959104744), gender = structure(c(1L, 
2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 
2L, 2L, 1L, 1L, 2L, 2L, 1L, 2L), .Label = c("Female", "Male"), class = "factor"), 
age = c(47.9379517873091, 46.837373193357, 48.5646295793097, 
43.1378807456583, 60.3619393447192, 70.1290549397305, 84.3587981654008, 
59.2292347942614, 41.7327157246053, 52.0137845399698, 55.0951441078166, 
71.6184307122057, 43.3101374804154, 33.5854501557607, 51.9032470737109, 
68.1204996602706, 42.9427562299075, 55.909031412815, 29.895500127283, 
20.9296411673894, 29.3957377286062, 46.974102661638, 54.6740110130539, 
42.6997039072135, 67.3413773507263), asdas_baseline = c(63.7251494911822, 
NA, 65.0638161875852, 70.1816100941605, 53.1972327260365, 
62.980030777934, 60.3085321252511, 58.9998256902073, 56.8045598820947, 
54.4446059090559, NA, 61.7293600038226, 56.5674724119214, 
62.8593507709476, NA, 54.9028311743253, NA, NA, 67.6467591815449, 
58.5134614505046, 59.3735346553234, 51.9158516755166, 63.0645651881476, 
58.7759004270177, 55.0687922895208)), class = "data.frame", row.names = c(NA, 
-25L))

Here is what it looks like:

'data.frame':   25 obs. of  4 variables:
$ asdas_6month  : num  23.1 25.4 26 NA 26.9 ...
$ gender        : Factor w/ 2 levels "Female","Male": 1 2 1 1 1 1 2 2 2 1 ...
$ age           : num  47.9 46.8 48.6 43.1 60.4 ...
$ asdas_baseline: num  63.7 NA 65.1 70.2 53.2 ...

With the following code, I can generate a mids object using the mice library and create five imputed datasets:

library(mice)    
new_imp <- mice(data, m=5, maxit=10, print = FALSE, seed = 449)
print(new_imp)
Number of multiple imputations:  5 
Imputation methods:
  asdas_6month         gender            age asdas_baseline 
         "pmm"             ""             ""          "pmm" 
PredictorMatrix:
               asdas_6month gender age asdas_baseline
asdas_6month              0      1   1              1
gender                    1      0   1              1
age                       1      1   0              1
asdas_baseline            1      1   1              0

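My goal is to create a new recoded variable, asdas_improvement, that flags an improvement of 40 or more in the ASDAS score at 6 months (the cutoff used in the code below). Normally, I would compute it with dplyr's mutate function as follows: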

library(dplyr)
data %>%
  mutate(asdas_improvement = if_else(asdas_baseline - asdas_6month >= 40, 1, 0))

How can I recode a variable like this inside a mids object?

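To run calculations on the imputed datasets, we can use complete to obtain the imputed data as a data frame, apply mutate as usual, and then convert the result back to a mids object with as.mids.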

library(dplyr)
# include = TRUE keeps the original incomplete data as .imp == 0, which as.mids() needs
full.impdata <- complete(new_imp, 'long', include = TRUE) %>%
  mutate(asdas_improvement = if_else(asdas_baseline - asdas_6month >= 40, 1, 0))
new_imp <- as.mids(full.impdata)

Output:

str(new_imp$imp$asdas_improvement)
'data.frame':   11 obs. of  5 variables:
$ 1: num  0 1 0 0 1 0 0 0 1 0 ...
$ 2: num  0 1 0 0 0 0 0 1 0 0 ...
$ 3: num  0 1 0 0 0 0 1 1 0 0 ...
$ 4: num  0 1 1 0 0 0 0 1 0 0 ...
$ 5: num  0 1 0 0 0 0 0 0 1 0 ...
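
The round trip through the long format can be checked on a small scale. The sketch below is not the original poster's data: it assumes the `nhanes` dataset that ships with mice as a stand-in, and the variable `high_bmi` and its cutoff of 25 are hypothetical. It illustrates why `include = TRUE` matters: in the original rows (`.imp == 0`) the derived variable is `NA` wherever its input is `NA`, which is how `as.mids` can tell which cells were imputed.

```r
library(mice)
library(dplyr)

# Stand-in data (assumption): `nhanes` ships with mice and has missing bmi values.
imp <- mice(nhanes, m = 5, maxit = 5, printFlag = FALSE, seed = 1)

# Stack the original data (.imp == 0) on top of the 5 completed datasets,
# then derive the new variable in every block at once.
long <- mice::complete(imp, action = "long", include = TRUE) %>%
  mutate(high_bmi = if_else(bmi >= 25, 1, 0))  # hypothetical cutoff, illustration only

# In the original rows, the derived variable is NA exactly where bmi is NA.
orig <- filter(long, .imp == 0)
na_matches <- identical(is.na(orig$high_bmi), is.na(orig$bmi))

# Convert back; the derived variable now has its own table of imputed values.
imp2 <- as.mids(long)
n_imputed_rows <- nrow(imp2$imp$high_bmi)  # one row per originally missing bmi
```

Writing `mice::complete` explicitly is a defensive choice here: tidyr also exports a function named `complete`, so the unqualified call can resolve to the wrong one if tidyr is attached after mice.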

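
Once the recoded variable lives in the mids object, it can be used like any other imputed variable. The following is a minimal sketch under the same assumptions as before (the built-in `nhanes` data as a stand-in, a hypothetical `high_bmi` indicator, and an arbitrary linear model chosen only for illustration): it summarises the derived variable in each completed dataset and then fits and pools a model with `with()` and `pool()`.

```r
library(mice)
library(dplyr)

# Stand-in data (assumption): `nhanes` from mice; `high_bmi` is a hypothetical recode.
imp <- mice(nhanes, m = 5, maxit = 5, printFlag = FALSE, seed = 1)
imp2 <- mice::complete(imp, action = "long", include = TRUE) %>%
  mutate(high_bmi = if_else(bmi >= 25, 1, 0)) %>%
  as.mids()

# Proportion flagged within each of the 5 completed datasets.
props <- sapply(mice::complete(imp2, action = "all"),
                function(d) mean(d$high_bmi))

# Fit the same (illustrative) model in every completed dataset,
# then combine the estimates with Rubin's rules.
fit <- with(imp2, lm(chl ~ high_bmi + age))
pooled <- pool(fit)
summary(pooled)
```

Because the indicator is computed from imputed values rather than imputed directly, its values differ across the completed datasets, and pooling carries that between-imputation variability into the standard errors.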