R mlr surv.ranger error: `[.data.frame`(num.response, x == y): undefined columns selected



Prepare the data (the "ovarian" dataset from the survival package):

require(pacman)
p_load(mlr, survival, tidyverse, ranger)
data("ovarian")
ovarian$rx <- factor(ovarian$rx, 
                     levels = c("1", "2"), 
                     labels = c("A", "B"))
ovarian$resid.ds <- factor(ovarian$resid.ds, 
                           levels = c("1", "2"), 
                           labels = c("no", "yes"))
ovarian$ecog.ps <- factor(ovarian$ecog.ps, 
                          levels = c("1", "2"), 
                          labels = c("good", "bad"))
ovarian <- ovarian %>% mutate(age_group = ifelse(age >=50, "old", "young"))
ovarian$age_group <- factor(ovarian$age_group)

Now train surv.ranger through the mlr package:

trainTask <- makeSurvTask(data = ovarian, target = c("futime", "fustat"))
trainLearner <- makeLearner("surv.ranger", predict.type = "response")
train(trainLearner,trainTask)
Error in `[.data.frame`(num.response, x == y) : 
  undefined columns selected

Why does this error occur, and how can I fix it?

I then tried another example dataset (lung.task, which ships with mlr), but got a different error:

trainLearner <- makeLearner("surv.ranger", predict.type = "response")
train(trainLearner,lung.task) # lung.task is from mlr package
Error in ranger::ranger(formula = NULL, dependent.variable.name = tn[1L],  : 
  argument ".weights" is missing, with no default

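It took me a long time to track down, but I found the cause of the first error. It comes from the respect.unordered.factors argument of ranger. Calling ranger directly with the value "order" fails in the same way: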

ranger::ranger(formula = NULL, dependent.variable.name = "futime", status.variable.name = "fustat", data = ovarian, respect.unordered.factors = "order")
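To check which settings are affected on a given installation, one option is to try each value of respect.unordered.factors in turn and record whether the call errors. The following is only a sketch: it assumes the three values "ignore", "order" and "partition" accepted by ranger, re-creates the factor columns in a shortened form, and uses a small num.trees purely to keep the run quick. The results depend on the installed ranger version.

```r
library(ranger)

# Shortened version of the data preparation: three factor predictors.
ovarian <- survival::ovarian
for (col in c("rx", "resid.ds", "ecog.ps")) {
  ovarian[[col]] <- factor(ovarian[[col]])
}

# Try each setting and record "ok" or the error message.
values <- c("ignore", "order", "partition")
outcome <- vapply(values, function(v) {
  tryCatch({
    ranger::ranger(formula = NULL,
                   dependent.variable.name = "futime",
                   status.variable.name = "fustat",
                   data = ovarian,
                   num.trees = 50,  # small forest, assumption for speed only
                   respect.unordered.factors = v)
    "ok"
  }, error = function(e) conditionMessage(e))
}, character(1))

print(outcome)
```

On an affected version, the entry for "order" should contain the "undefined columns selected" message, while the other entries read "ok".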

To work around it, set the argument to a different value when creating the learner:

# Use either one of these; both avoid the failing "order" setting.
lrn <- makeLearner("surv.ranger", predict.type = "response", respect.unordered.factors = "partition")
lrn <- makeLearner("surv.ranger", predict.type = "response", respect.unordered.factors = "ignore")
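Put together, the workaround might look like the sketch below. It assumes mlr, ranger and survival are installed, repeats the factor conversion from the question (without age_group), and predicts on the training data only to confirm that training now runs through; it is not a proper evaluation.

```r
library(mlr)

# Same factor conversion as in the question, minus age_group.
ovarian <- survival::ovarian
ovarian$rx <- factor(ovarian$rx, levels = c(1, 2), labels = c("A", "B"))
ovarian$resid.ds <- factor(ovarian$resid.ds, levels = c(1, 2), labels = c("no", "yes"))
ovarian$ecog.ps <- factor(ovarian$ecog.ps, levels = c(1, 2), labels = c("good", "bad"))

task <- makeSurvTask(data = ovarian, target = c("futime", "fustat"))

# Avoid respect.unordered.factors = "order", which triggers the error.
lrn <- makeLearner("surv.ranger",
                   predict.type = "response",
                   respect.unordered.factors = "partition")

mod <- train(lrn, task)

# In-sample predictions, just to show the model is usable.
pred <- predict(mod, task = task)
head(getPredictionResponse(pred))
```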

Edit: In the latest ranger version on GitHub, this error no longer occurs. To install it, run the following command and restart R:

devtools::install_github("imbs-hl/ranger")

See also: https://github.com/imbs-hl/ranger/issues/359
