Prepare the data (`ovarian` from the survival package):
require(pacman)
p_load(mlr, survival, tidyverse, ranger)
data("ovarian")
ovarian$rx <- factor(ovarian$rx,
                     levels = c("1", "2"),
                     labels = c("A", "B"))
ovarian$resid.ds <- factor(ovarian$resid.ds,
                           levels = c("1", "2"),
                           labels = c("no", "yes"))
ovarian$ecog.ps <- factor(ovarian$ecog.ps,
                          levels = c("1", "2"),
                          labels = c("good", "bad"))
ovarian <- ovarian %>% mutate(age_group = ifelse(age >=50, "old", "young"))
ovarian$age_group <- factor(ovarian$age_group)
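The recoding above matches numeric codes against character levels, and any code not listed in `levels` silently becomes NA. A minimal sketch of how one might check for that, using a made-up toy vector (`rx_raw`) in place of the real column:

```r
# Toy stand-in for a numerically coded column such as ovarian$rx (values are made up).
rx_raw <- c(1, 2, 2, 1, 2)

# Same recoding pattern as above: numeric codes are matched against character levels.
rx <- factor(rx_raw, levels = c("1", "2"), labels = c("A", "B"))

# Any code missing from `levels` would silently become NA, so count them.
n_missing <- sum(is.na(rx))
recoded_counts <- table(rx, useNA = "ifany")
print(recoded_counts)
```

If `n_missing` is greater than zero, the `levels` argument probably does not cover every code in the data.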
Now run surv.ranger through the mlr package:
trainTask <- makeSurvTask(data = ovarian, target = c("futime", "fustat"))
trainLearner <- makeLearner("surv.ranger", predict.type = "response")
train(trainLearner,trainTask)
Error in `[.data.frame`(num.response, x == y) :
undefined columns selected
Why does this error occur, and how can I fix it?
I then tried another example dataset (lung.task, which ships with mlr), but got a different error:
trainLearner <- makeLearner("surv.ranger", predict.type = "response")
train(trainLearner,lung.task) # lung.task is from mlr package
Error in ranger::ranger(formula = NULL, dependent.variable.name = tn[1L], :
argument ".weights" is missing, with no default
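It took me quite a while to track down, but I have now found the source of the error. It comes from the respect.unordered.factors parameter in the ranger package. The following direct call to ranger fails as well: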
ranger::ranger(formula = NULL, dependent.variable.name = "futime", status.variable.name = "fustat", data = ovarian, respect.unordered.factors = "order")
To work around it, set the parameter to a different value, using either of the following:
lrn <- makeLearner("surv.ranger", predict.type = "response", respect.unordered.factors = "partition")
lrn <- makeLearner("surv.ranger", predict.type = "response", respect.unordered.factors = "ignore")
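Since respect.unordered.factors controls how ranger handles unordered factor covariates, it can help to see which columns of the data are affected. A rough sketch in base R follows; `unordered_factor_cols` is a hypothetical helper written for illustration (it is not part of mlr or ranger), and the toy data frame only stands in for the prepared data:

```r
# Hypothetical helper (not part of mlr or ranger): names of unordered factor columns.
unordered_factor_cols <- function(df) {
  is_unordered <- vapply(df, function(col) is.factor(col) && !is.ordered(col), logical(1))
  names(df)[is_unordered]
}

# Toy data frame standing in for the prepared survival data (values are made up).
toy <- data.frame(
  futime = c(59, 115, 156),
  fustat = c(1, 1, 0),
  rx     = factor(c("A", "B", "A")),
  stage  = factor(c("low", "high", "low"), levels = c("low", "high"), ordered = TRUE)
)

# Only `rx` is an unordered factor; `stage` is ordered and the rest are numeric.
affected <- unordered_factor_cols(toy)
print(affected)
```

Calling `unordered_factor_cols(ovarian)` on the prepared data would list the recoded factor columns in the same way.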
Edit: in the latest ranger version on GitHub, this error no longer occurs. To install it, run the following command and restart R:
devtools::install_github("imbs-hl/ranger")
See also: https://github.com/imbs-hl/ranger/issues/359