R caret: determining performance on the whole training dataset after tuning parameters are selected

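I am using the R caret package, and I am a bit confused about how to obtain the performance metrics (RMSE and R-squared in my case) once the tuning parameters have been selected and the model is evaluated on the whole training dataset with those parameters.

Below is part of the output from the trained model: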

eXtreme Gradient Boosting
Resampling: Cross-Validated (5 fold, repeated 5 times)
Summary of sample sizes: 1771, 1769, 1772, 1770, 1770, 1770, ...
Resampling results across tuning parameters:
   lambda  alpha  nrounds  RMSE      Rsquared   MAE
   0e+00   0e+00   50      1.964635  0.6504540  1.269607
   and ~25 more tuning parameters ...
   1e-01   1e-01  150      1.970099  0.6517576  1.252826
RMSE was used to select the optimal model using the smallest value.
The final values used for the model were nrounds = 50, lambda = 1e-04, alpha = 1e-04 and eta = 0.3

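So far so good. I think I understand the RMSE and Rsquared on the resampled data and how the tuning parameters are selected.

My question: how do I get the RMSE and Rsquared when the selected tuning parameters are applied to the whole training dataset?
Are they the performance metrics shown above?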

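There is a function called getTrainPerf.

According to the documentation, "The function getTrainPerf returns a one row data frame with the resampling results for the chosen model." So it is just a convenient way to get the resampling results for the best tuning parameters.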
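
For illustration, here is a minimal sketch of what getTrainPerf returns. It assumes a small linear model on the built-in mtcars data as a stand-in for the xgboost model above (the formula and seed are arbitrary choices for the example, not part of the original question):

```r
library(caret)

set.seed(42)  # arbitrary seed so the resampling folds are reproducible

# Same resampling scheme as in the question: 5-fold CV, repeated 5 times
ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 5)

# Stand-in model (assumption): lm on mtcars instead of the xgboost model
fit <- train(mpg ~ wt + hp, data = mtcars, method = "lm", trControl = ctrl)

# One-row data frame with the resampled metrics for the selected tuning values
perf <- getTrainPerf(fit)
print(perf)

# The same numbers can be looked up by matching bestTune against the results table
best <- merge(fit$results, fit$bestTune)
print(best[, c("RMSE", "Rsquared")])
```

The values in perf are averages over the held-out folds, so they match the row of the "Resampling results" table for the chosen parameters; they are not computed by predicting the full training set.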

Am I overthinking this?

Thanks!!!

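Re-predicting the training set is not a good idea. The results from getTrainPerf are probably a good estimate of how well the model fit on the entire training set works.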
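
To make the distinction concrete, the following sketch computes both numbers side by side. It again assumes a stand-in lm model on mtcars rather than the asker's xgboost model; predict and postResample are standard caret functions, but the data and formula are illustrative only:

```r
library(caret)

set.seed(42)  # arbitrary seed for reproducible folds

ctrl <- trainControl(method = "repeatedcv", number = 5, repeats = 5)

# Stand-in model (assumption): lm on mtcars instead of the xgboost model
fit <- train(mpg ~ wt + hp, data = mtcars, method = "lm", trControl = ctrl)

# Resampled estimate: computed on held-out folds for the chosen tuning values
resampled <- getTrainPerf(fit)

# Apparent (resubstitution) performance: the final model re-predicts
# the same rows it was trained on
pred <- predict(fit, newdata = mtcars)
apparent <- postResample(pred = pred, obs = mtcars$mpg)

print(resampled)
print(apparent)
```

The apparent numbers usually look better than the resampled ones because the model has already seen every row it is predicting, which is why the resampled estimate is the one to report.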

Am I overthinking this?

A bit, yes. That said, it is fairly complicated.
