R - How to specify offset_column in h2o.stackedEnsemble()



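I am running GBM and GLM with an offset_column as base learners in h2o. My response variable is binary and the offset column is an ordinary numeric column. The base learners work. Here is the code: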

train["offset"] <- train["log_hazard"]  # offset column in the training set
my_gbm <- h2o.gbm(x = x, y = y, training_frame = train,
                  fold_column = "fold_id",
                  keep_cross_validation_predictions = TRUE,
                  offset_column = "offset",
                  seed = 1)
my_glm <- h2o.glm(x = x, y = y, training_frame = train,
                  fold_column = "fold_id",
                  keep_cross_validation_predictions = TRUE,
                  offset_column = "offset",
                  seed = 1, family = "binomial")
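For reference, the role of the offset in these two base learners can be written out as follows. This is a sketch under the assumption that the offset enters on the link (logit) scale, which is how the H2O documentation describes offsets for non-Gaussian distributions. The symbols p_i, f, x_i and o_i are introduced here for illustration and do not appear in the original post.

```latex
% p_i : predicted probability that the binary response equals 1 for row i
% f   : the part learned by the model (sum of trees for GBM,
%       linear predictor \beta_0 + x_i^\top \beta for GLM)
% o_i : value of the "offset" column for row i (here log_hazard)
\operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i} = f(x_i) + o_i
```

The offset term has a fixed coefficient of 1, so the model only learns f.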

Then I pass offset_column to h2o.stackedEnsemble() through metalearner_params. Here is the code:

stack_model <- h2o.stackedEnsemble(x = x,
                                   y = y,
                                   training_frame = train,
                                   base_models = list(my_gbm, my_glm),
                                   metalearner_params = list(offset_column = "offset"))

But I get the following error:

ERRR on field: _offset_column: Offset column 'offset' not found in the training frame

The offset column is in the training data, so I am not sure why I get this error message.

I then tried running h2o.stackedEnsemble() without the metalearner_params option. Here is the code:

stack_model <- h2o.stackedEnsemble(x = x,
                                   y = y,
                                   training_frame = train,
                                   base_models = list(my_gbm, my_glm))

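and got the following warning message:

Warning message: In .h2o.startModelJob(algo, params, h2oRestApiVersion) : Dropping bad and constant columns: [offset].

I am not sure whether it ran correctly. Can anyone help me with this problem?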

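If you read the H2O documentation for h2o.stackedEnsemble carefully, you will see that the metalearner no longer needs an offset argument, because it is trained on the cross-validated predictions from the base models: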

# Base learners: both use the offset and keep their cross-validated predictions
my_gbm <- h2o.gbm(x = x, y = y, training_frame = train,
                  fold_column = "fold_id",
                  keep_cross_validation_predictions = TRUE,
                  offset_column = "offset",
                  seed = 1)
my_glm <- h2o.glm(x = x, y = y, training_frame = train,
                  fold_column = "fold_id",
                  keep_cross_validation_predictions = TRUE,
                  offset_column = "offset",
                  seed = 1, family = "binomial")

# Stacked ensemble: no offset_column is passed to the metalearner
stack_model <- h2o.stackedEnsemble(x = x,
                                   y = y,
                                   training_frame = train,
                                   base_models = list(my_gbm, my_glm))

# Compare performance on a held-out frame (test must also contain the "offset" column)
h2o.performance(my_gbm, newdata = test)
h2o.performance(my_glm, newdata = test)
h2o.performance(stack_model, newdata = test)
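The reasoning above can be sketched in equations. This is an illustration only, not H2O's exact internals: it assumes the offset is applied on the logit scale in each base learner, and every symbol below (z, o_i, k(i), the fitted functions and g) is introduced here rather than taken from the original post.

```latex
% k(i)            : cross-validation fold that contains row i (from fold_id)
% \hat f^{(-k(i))}: base model fitted without fold k(i)
% o_i             : offset value for row i
% \sigma          : inverse logit, \sigma(t) = 1 / (1 + e^{-t})
% g               : the metalearner (by default a GLM in H2O)
\begin{aligned}
z_i^{\mathrm{gbm}} &= \sigma\!\left(\hat f_{\mathrm{gbm}}^{(-k(i))}(x_i) + o_i\right), \\
z_i^{\mathrm{glm}} &= \sigma\!\left(\hat f_{\mathrm{glm}}^{(-k(i))}(x_i) + o_i\right), \\
\hat p_i^{\mathrm{ens}} &= g\!\left(z_i^{\mathrm{gbm}},\, z_i^{\mathrm{glm}}\right)
\end{aligned}
```

Under these assumptions the metalearner's only inputs are the cross-validated predictions z, and each of them already includes o_i, which is why no separate offset is supplied at the ensemble level.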
