Using predefined training and test sets in mlr3 benchmarking

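I want to use the benchmark_grid() function in mlr3 to compare several machine learning algorithms on a classification task. According to https://mlr3book.mlr-org.com/benchmarking.html, benchmark_grid() takes a resampling scheme that splits the task's data into training and test data. However, I want to use a manual partition instead. How can I manually specify the training and test sets when using benchmark_grid()?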

Edit: code example based on pat-s's suggestion

# use benchmark() from mlr3 to compare different classification models on the iris data set using a manually
# pre-defined partitioning into training and test data sets (hold-out sampling)
library("mlr3verse")
# Instantiate Task
task = tsk("iris")
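# Instantiate custom resampling:
# hold-out split with a pre-defined partition into one train and one test set
# (row ids 1:120 for training, 121:150 for testing; iris is ordered by species,
# so this test set contains only virginica rows)
custom = rsmp("custom")
train_sets = list(1:120)
test_sets = list(121:150)
custom$instantiate(task, train_sets, test_sets)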

design = benchmark_grid(
  tasks = task,
  learners = lrns(c("classif.ranger", "classif.rpart", "classif.featureless"),
    predict_type = "prob", predict_sets = c("train", "test")),
  resamplings = custom
)
print(design)

# execute the benchmark
bmr = benchmark(design)
measure = msr("classif.acc")
tab = bmr$aggregate(measure)
print(tab)

You can use the "custom_cv" resampling scheme.
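
A minimal sketch of how "custom_cv" might be wired into benchmark_grid(), assuming the mlr3 package is installed. The interleaved three-fold assignment in `folds` is a hypothetical example and not part of the question; each level of the factor becomes the test set of one iteration, and all remaining rows form the corresponding training set.

```r
library(mlr3)

task = tsk("iris")

# Hypothetical manual fold assignment: one label per row of the task.
# Rows 1, 4, 7, ... go to fold 1, rows 2, 5, 8, ... to fold 2, and so on.
folds = factor(rep(1:3, length.out = task$nrow))

# Each factor level is used once as the test set.
custom_cv = rsmp("custom_cv")
custom_cv$instantiate(task, f = folds)

# Only learners shipped with mlr3 itself are used here.
learners = lrns(c("classif.rpart", "classif.featureless"))

design = benchmark_grid(
  tasks = task,
  learners = learners,
  resamplings = custom_cv
)

bmr = benchmark(design)
tab = bmr$aggregate(msr("classif.acc"))
print(tab)
```

For a single fixed train/test split, rsmp("custom") with explicit train_sets and test_sets, as in the edit above, is the more direct fit, because "custom_cv" evaluates on every fold in turn.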
