Dataset has a categorical response column when using the H2O deep learning package in R



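I think I have run into a data type conversion (casting) problem while using the H2O platform in R.

Here is the error:

Error: water.exceptions.H2OModelBuilderIllegalArgumentException: Illegal argument(s) for GBM model: GBM_model_R_1568616391145_4. Details: ERRR on field: _validation_frame: Test/Validation dataset has a categorical response column 'gold' with no levels in common with the model

Here is the code: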

library(h2o)
kd_h2o = h2o.init(nthreads = -1)
data = readxl::read_excel("C:/Users/frzd/Desktop/mtx.xlsx")
data_order <- data[order(data$gold),]
data_order$gold=h2o.asfactor(data_order$gold)
Split_ts = .2
Split_vl = .1
indx <- 1:round(length(data$gold)*Split_ts)
ts <- max(indx)
ts <- round(indx*length(data$gold)/ts)
test = as.h2o(data_order[ts,])
train = data_order[-ts,]
indx <- 1:round(length(train$gold)*Split_vl)
ts <- max(indx)
ts <- round(indx*length(train$gold)/ts)
valid = as.h2o(train[ts,])
train = as.h2o(train[-ts,])
fit <- h2o.gbm(y = 15, 
training_frame = train, 
validation_frame=valid,
# cvControl = list(V = 5),
)

Can you help me? :)
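
For context, the error message says that none of the factor levels found in the validation response also occur in the training response. One way this can happen is when a continuous numeric column is converted to a factor, so that every distinct value becomes its own level. The following base-R sketch illustrates that condition; the values are made up for illustration and are not taken from mtx.xlsx, and whether `gold` is actually continuous in the original data is an assumption.

```r
# Hypothetical numeric response values (invented for illustration only).
gold <- c(1201.5, 1203.2, 1207.8, 1210.1, 1215.4, 1219.9)

# Converting a continuous column to a factor gives one level per distinct value.
gold_factor <- as.factor(gold)

# Hold out two rows as a "validation" set.
valid_idx <- c(2, 5)
train_levels <- unique(as.character(gold_factor[-valid_idx]))
valid_levels <- unique(as.character(gold_factor[valid_idx]))

# Levels present in both sets; with all-distinct values there are none.
shared <- intersect(train_levels, valid_levels)
length(shared)  # 0
```

If the response really is a small set of categories, the same check can be used to confirm that every validation level also appears in the training set.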

I figured it out.

The cause was incorrect use of the data frame. Here is the corrected code:

# start (or connect to) a local H2O cluster
h2o.init(nthreads = -1)
# data preparation
data = readxl::read_excel("C:/Users/frzd/Desktop/mtx.xlsx")
data_order <- data[order(data$gold),]
data_order=h2o.asfactor(data_order)
# data split
Split_ts = .2
Split_vl = .1
indx <- 1:round(length(data$gold)*Split_ts)
ts <- max(indx)
ts <- round(indx*length(data$gold)/ts)
test = as.h2o(data_order[ts,])
train = data_order[-ts,]
indx <- 1:round(length(train$gold)*Split_vl)
ts <- max(indx)
ts <- round(indx*length(train$gold)/ts)
valid = as.h2o(train[ts,])
train = as.h2o(train[-ts,])
# perform fitting
fit <- h2o.gbm(y = 15, 
distribution= "gaussian",
training_frame = train, 
validation_frame=valid
# cvControl = list(V = 5),
)
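
As a possible alternative, here is a minimal sketch of the same workflow that converts the data to an H2OFrame before calling any `h2o.*` function and lets `h2o.splitFrame` do the splitting. It assumes that `gold` is a numeric quantity to be predicted by regression, which is what `distribution = "gaussian"` implies. The synthetic data frame, the predictor names `x1` and `x2`, and the 70/10/20 split are stand-ins chosen for illustration and are not part of the original data.

```r
library(h2o)

# Start (or connect to) a local H2O cluster.
h2o.init(nthreads = -1)

# Hypothetical stand-in for mtx.xlsx: two predictors and a numeric response "gold".
set.seed(1)
df <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
df$gold <- 1200 + 5 * df$x1 - 3 * df$x2 + rnorm(200)

# Convert to an H2OFrame first; h2o.* functions operate on H2OFrames.
hf <- as.h2o(df)

# Roughly 70% train, 10% validation, 20% test (splitFrame ratios are approximate).
splits <- h2o.splitFrame(hf, ratios = c(0.7, 0.1), seed = 42)
train <- splits[[1]]
valid <- splits[[2]]
test  <- splits[[3]]

# Regression on the numeric response, referenced by name rather than position.
fit <- h2o.gbm(
  y = "gold",
  training_frame = train,
  validation_frame = valid,
  distribution = "gaussian"
)

# Evaluate on the held-out test frame.
h2o.performance(fit, newdata = test)
```

If the response were instead a genuine category with a handful of levels, one would convert only that column with `h2o.asfactor()` after `as.h2o()` and drop the gaussian distribution.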
