r语言 - 拟合的coxph模型随地层和簇的时变而随公式项的顺序变化



我在R中的survival包中遇到了奇怪的行为,在我直接联系包维护者之前,我希望比我聪明的人能告诉我这是不是一个bug,而是一些统计上更微妙的东西(因此我在这里发帖)。

上下文是,我想拟合一个基于时间层的时变系数的Cox模型,我还想要健壮的标准误差来解释聚类观察。令我惊讶的是,输出根据formula参数中项的顺序而变化,并且它似乎选择性地忽略了相互作用,这取决于它们是在集群参数之前还是之后。

请参阅下面的代表,包括我的评论,指出了我看起来奇怪的事情。由于

附加问题:您知道为什么在expected_fit中,agestrata(tgroup)之前出现,而ph.ecogstrata(tgroup)之后出现吗?
library(survival);
library(tidyverse);
# Create a fake cluster in the lung data
lung_expanded <- 
lung %>%
mutate(fake_id = sample(1:3, nrow(lung), replace = T))

# Model below is as expected: I get the expected time-stratified hazard 
# ratios and robust standard errors 
expected_fit <- 
survSplit(Surv(time, status) ~ ., 
data = lung_expanded, 
cut = c(200, 300), 
episode = "tgroup") %>%
coxph(formula = Surv(tstart, time, status) ~ 
# putting cluster prior to age:time interaction
# gives expected results
cluster(fake_id) + age:strata(tgroup) + ph.ecog:strata(tgroup))
expected_fit
#> Call:
#> coxph(formula = Surv(tstart, time, status) ~ age:strata(tgroup) + 
#>     strata(tgroup):ph.ecog, data = ., cluster = fake_id)
#> 
#>                                     coef exp(coef)  se(coef) robust se      z
#> age:strata(tgroup)tgroup=1      0.007912  1.007943  0.013858  0.010575  0.748
#> age:strata(tgroup)tgroup=2     -0.003444  0.996562  0.021995  0.014347 -0.240
#> age:strata(tgroup)tgroup=3      0.020987  1.021209  0.015680  0.006119  3.430
#> strata(tgroup)tgroup=1:ph.ecog  0.665031  1.944551  0.176550  0.177987  3.736
#> strata(tgroup)tgroup=2:ph.ecog  0.650765  1.917007  0.278808  0.032228 20.193
#> strata(tgroup)tgroup=3:ph.ecog  0.097243  1.102129  0.188521  0.131935  0.737
#>                                       p
#> age:strata(tgroup)tgroup=1     0.454370
#> age:strata(tgroup)tgroup=2     0.810300
#> age:strata(tgroup)tgroup=3     0.000604
#> strata(tgroup)tgroup=1:ph.ecog 0.000187
#> strata(tgroup)tgroup=2:ph.ecog  < 2e-16
#> strata(tgroup)tgroup=3:ph.ecog 0.461090
#> 
#> Likelihood ratio test=24.9  on 6 df, p=0.0003561
#> n= 462, number of events= 164 
#>    (1 observation deleted due to missingness)
# Model output below differs from above, but it varies only in the ordering of the
# formula. I only get the hazard ratios  corresponding to  age but not 
# those corresponding ecog status. moreover, I get an unexpected hazard 
# ratio corresponding to a newly created variable called 'cluster(fake_id)'
suspicious_fit <- 
survSplit(Surv(time, status) ~ ., 
data = lung_expanded, 
cut = c(200, 300), 
episode = "tgroup") %>%
coxph(formula = Surv(tstart, time, status) ~ 
# putting age:time interaction prior to the cluster 
# gives suspicious results
age:strata(tgroup) + cluster(fake_id) + ph.ecog:strata(tgroup))
suspicious_fit
#> Call:
#> coxph(formula = Surv(tstart, time, status) ~ cluster(fake_id) + 
#>     age:strata(tgroup), data = ., cluster = fake_id)
#> 
#>                                coef exp(coef) se(coef) robust se      z       p
#> cluster(fake_id)           -0.04020   0.96060  0.09952   0.05094 -0.789 0.43004
#> age:strata(tgroup)tgroup=1  0.01928   1.01947  0.01354   0.01477  1.306 0.19166
#> age:strata(tgroup)tgroup=2  0.01029   1.01034  0.02149   0.01782  0.578 0.56359
#> age:strata(tgroup)tgroup=3  0.02254   1.02279  0.01556   0.00655  3.441 0.00058
#> 
#> Likelihood ratio test=4.62  on 4 df, p=0.3288
#> n= 463, number of events= 165
# In the model below, which has no time-varying strata, 
# everything looks good. 
expected_fit2  <- 
lung_expanded %>%
coxph(formula = Surv(time, status) ~ 
# putting age:time interaction prior to the cluster 
# gives suspicious results
age + cluster(fake_id) + ph.ecog)
expected_fit2
#> Call:
#> coxph(formula = Surv(time, status) ~ age + ph.ecog, data = ., 
#>     cluster = fake_id)
#> 
#>             coef exp(coef) se(coef) robust se     z      p
#> age     0.011281  1.011345 0.009319  0.006714  1.68 0.0929
#> ph.ecog 0.443485  1.558128 0.115831  0.028093 15.79 <2e-16
#> 
#> Likelihood ratio test=19.06  on 2 df, p=7.279e-05
#> n= 227, number of events= 164 
#>    (1 observation deleted due to missingness)
sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Catalina 10.15.7
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#>  [1] forcats_0.5.1   stringr_1.4.0   dplyr_1.0.6     purrr_0.3.4    
#>  [5] readr_1.4.0     tidyr_1.1.3     tibble_3.1.2    ggplot2_3.3.3  
#>  [9] tidyverse_1.3.1 survival_3.2-11
#> 
#> loaded via a namespace (and not attached):
#>  [1] tidyselect_1.1.1  xfun_0.23         splines_4.1.0     haven_2.4.1      
#>  [5] lattice_0.20-44   colorspace_2.0-1  vctrs_0.3.8       generics_0.1.0   
#>  [9] htmltools_0.5.1.1 yaml_2.2.1        utf8_1.2.1        rlang_0.4.11     
#> [13] pillar_1.6.1      glue_1.4.2        withr_2.4.2       DBI_1.1.1        
#> [17] dbplyr_2.1.1      readxl_1.3.1      modelr_0.1.8      lifecycle_1.0.0  
#> [21] cellranger_1.1.0  munsell_0.5.0     gtable_0.3.0      rvest_1.0.0      
#> [25] evaluate_0.14     knitr_1.33        fansi_0.5.0       highr_0.9        
#> [29] Rcpp_1.0.7        broom_0.7.6       backports_1.2.1   scales_1.1.1     
#> [33] jsonlite_1.7.2    fs_1.5.0          hms_1.1.0         digest_0.6.27    
#> [37] stringi_1.6.2     grid_4.1.0        cli_2.5.0         tools_4.1.0      
#> [41] magrittr_2.0.1    crayon_1.4.1      pkgconfig_2.0.3   ellipsis_0.3.2   
#> [45] Matrix_1.3-4      xml2_1.3.2        reprex_2.0.0      lubridate_1.7.10 
#> [49] assertthat_0.2.1  rmarkdown_2.8     httr_1.4.2        rstudioapi_0.13  
#> [53] R6_2.5.0          compiler_4.1.0
Created on 2021-08-16 by the reprex package (v2.0.0)
```

我深入研究了survival::coxph代码,找出了有问题的代码行,并联系了Terry Therneau(软件包维护者),他证实了这是一个bug。他指出有"……虫子在…以R为底的公式,通过更新继承。公式和重新公式。他已经告诉我3.2-13版本修复了这个问题。

相关内容

  • 没有找到相关文章

最新更新