I am running GBM and GLM with offset_column as base learners in h2o. My response variable is binary, and offset_column is a positive number. The base learners work. Here is the code:
train["offset"]<-train["log_hazard"] # offset column in the training set
my_gbm <- h2o.gbm(x = x, y = y, training_frame = train,
                  fold_column = "fold_id",
                  keep_cross_validation_predictions = TRUE,
                  offset_column = "offset",
                  seed = 1)
my_glm <- h2o.glm(x = x, y = y, training_frame = train,
                  fold_column = "fold_id",
                  keep_cross_validation_predictions = TRUE,
                  offset_column = "offset",
                  seed = 1, family = "binomial")
I then pass offset_column through metalearner_params in h2o.stackedEnsemble(). Here is the code:
stack_model <- h2o.stackedEnsemble(x = x,
                                   y = y,
                                   training_frame = train,
                                   base_models = list(my_gbm, my_glm),
                                   metalearner_params = list(offset_column = "offset"))
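But I get the following error:
ERRR on field: _offset_column: Offset column 'offset' not found in the training frame
However, offset_column is in the training data. I am not sure why I get this error message.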
I then tried running h2o.stackedEnsemble() without the metalearner_params option. Here is the code:
stack_model <- h2o.stackedEnsemble(x = x,
                                   y = y,
                                   training_frame = train,
                                   base_models = list(my_gbm, my_glm))
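and received the following warning message:
Warning message: In .h2o.startModelJob(algo, params, h2oRestApiVersion) : Dropping bad and constant columns: [offset].
I am not sure whether it ran correctly. Can anyone help me with this?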
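If you read the H2O documentation for h2o.stackedEnsemble carefully, you will see that the H2O metalearner does not need the offset parameter, because it is trained on the cross-validated predictions from the base models: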
my_gbm <- h2o.gbm(x = x, y = y, training_frame = train,
                  fold_column = "fold_id",
                  keep_cross_validation_predictions = TRUE,
                  offset_column = "offset",
                  seed = 1)
my_glm <- h2o.glm(x = x, y = y, training_frame = train,
                  fold_column = "fold_id",
                  keep_cross_validation_predictions = TRUE,
                  offset_column = "offset",
                  seed = 1, family = "binomial")
stack_model <- h2o.stackedEnsemble(x = x,
                                   y = y,
                                   training_frame = train,
                                   base_models = list(my_gbm, my_glm))
h2o.performance(my_gbm, newdata = test)
h2o.performance(my_glm, newdata = test)
h2o.performance(stack_model, newdata = test)
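To see what the metalearner is actually trained on, you can look at the cross-validated holdout predictions of each base model and at the fitted metalearner. The following is only a rough sketch on synthetic data: the predictors x1 and x2, the sample size, and the way y is simulated are made up for illustration, and the call to h2o.coef() assumes the metalearner is the default GLM.

```r
library(h2o)
h2o.init()

# Synthetic data, purely illustrative; column names mirror the question.
set.seed(1)
n <- 1000
df <- data.frame(
  x1 = rnorm(n),
  x2 = rnorm(n),
  log_hazard = runif(n, 0.1, 1),               # positive offset values
  fold_id = sample(rep(1:5, length.out = n))   # five fixed folds
)
df$y <- factor(rbinom(n, 1, plogis(df$x1 - df$x2 + df$log_hazard - 0.5)))

train <- as.h2o(df)
train["offset"] <- train["log_hazard"]
x <- c("x1", "x2")   # hypothetical predictors
y <- "y"

# Base learners: the offset is used here, and only here.
my_gbm <- h2o.gbm(x = x, y = y, training_frame = train,
                  fold_column = "fold_id",
                  keep_cross_validation_predictions = TRUE,
                  offset_column = "offset",
                  seed = 1)
my_glm <- h2o.glm(x = x, y = y, training_frame = train,
                  fold_column = "fold_id",
                  keep_cross_validation_predictions = TRUE,
                  offset_column = "offset",
                  seed = 1, family = "binomial")

# The ensemble is built without passing any offset to the metalearner.
stack_model <- h2o.stackedEnsemble(x = x, y = y, training_frame = train,
                                   base_models = list(my_gbm, my_glm))

# Holdout predictions kept by keep_cross_validation_predictions = TRUE:
# one row per training row for each base model.
gbm_cv <- h2o.cross_validation_holdout_predictions(my_gbm)
glm_cv <- h2o.cross_validation_holdout_predictions(my_glm)
print(head(gbm_cv))
print(head(glm_cv))

# The fitted metalearner; with the default GLM metalearner its
# coefficients can be printed with h2o.coef().
metalearner <- h2o.getModel(stack_model@model$metalearner$name)
print(h2o.coef(metalearner))
```

The offset already influences the base-model predictions, so the metalearner receives it only indirectly through those predictions.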