Meaning of n_estimators and max_features in RandomForestRegressor

I am reading about fine-tuning a model with GridSearchCV, and I came across the parameter grid shown below:

from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

param_grid = [
    # first grid: 3 x 4 = 12 combinations
    {'n_estimators': [3, 10, 30], 'max_features': [2, 4, 6, 8]},
    # second grid, with bootstrap turned off: 2 x 3 = 6 combinations
    {'bootstrap': [False], 'n_estimators': [3, 10], 'max_features': [2, 3, 4]},
]
forest_reg = RandomForestRegressor(random_state=42)
# train across 5 folds, that's a total of (12+6)*5=90 rounds of training
grid_search = GridSearchCV(forest_reg, param_grid, cv=5,
                           scoring='neg_mean_squared_error')
grid_search.fit(housing_prepared, housing_labels)
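The count in the comment above can be checked without training anything. The following is a small sketch using scikit-learn's ParameterGrid on the same grid; the variable names n_candidates and n_fits are only illustrative.

```python
from sklearn.model_selection import ParameterGrid

param_grid = [
    {'n_estimators': [3, 10, 30], 'max_features': [2, 4, 6, 8]},
    {'bootstrap': [False], 'n_estimators': [3, 10], 'max_features': [2, 3, 4]},
]

# ParameterGrid expands every dict into its individual combinations
n_candidates = len(ParameterGrid(param_grid))  # 3*4 + 1*2*3
n_fits = n_candidates * 5                      # each candidate is fit once per fold (cv=5)

print(n_candidates, n_fits)
```

With the default refit=True, GridSearchCV also trains the best candidate once more on the full training set after the search, on top of these cross-validation fits.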

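Here I don't understand what n_estimators and max_features mean. Does n_estimators mean the number of records taken from the data, and max_features the number of attributes selected from the data?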

Going further, I got this result:

>> grid_search.best_params_
{'max_features': 8, 'n_estimators': 30}

So I don't understand what this result is actually telling me.

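If you read the documentation for RandomForestRegressor, you can see that n_estimators is the number of trees in the forest. Since random forest is an ensemble method that builds multiple decision trees, this parameter controls how many trees are built.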

max_features, on the other hand, sets the maximum number of features considered when looking for a split. For more information on max_features, read this answer.
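To make the n_estimators point concrete, here is a minimal sketch. It uses synthetic data from make_regression as a stand-in (an assumption; this is not the housing data from the question) and checks that the fitted forest holds exactly n_estimators trees and that its prediction is the average of the individual tree predictions.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (assumption): 200 rows, 8 feature columns
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=42)

forest = RandomForestRegressor(n_estimators=30, random_state=42).fit(X, y)

# The fitted trees are stored in estimators_, one per n_estimators
n_trees = len(forest.estimators_)

# For regression, the forest's prediction is the mean over its trees
tree_preds = np.stack([tree.predict(X) for tree in forest.estimators_])
matches_mean = np.allclose(tree_preds.mean(axis=0), forest.predict(X))

print(n_trees, matches_mean)
```

So n_estimators has nothing to do with the number of records: it only sets how many trees are trained and averaged.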

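n_estimators: the number of trees you want to build before taking the majority vote (classification) or the average of the predictions (regression). More trees generally give better performance, with diminishing returns, but make training and prediction slower.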

max_features: the number of features to consider when looking for the best split.
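As a rough illustration of max_features, the sketch below again uses synthetic data (an assumption, not the data from the question) and reads the max_features_ attribute of each fitted tree, which reports how many features that tree considers at each split. It also shows the float form, which scikit-learn documents as a fraction of the feature count.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data (assumption): 8 feature columns
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=42)

# Integer: each split considers 2 of the 8 columns
forest_int = RandomForestRegressor(n_estimators=10, max_features=2,
                                   random_state=42).fit(X, y)
per_split_int = {tree.max_features_ for tree in forest_int.estimators_}

# Float: a fraction of the columns, here int(0.5 * 8) = 4
forest_frac = RandomForestRegressor(n_estimators=10, max_features=0.5,
                                    random_state=42).fit(X, y)
per_split_frac = {tree.max_features_ for tree in forest_frac.estimators_}

print(per_split_int, per_split_frac)
```

The model still receives every column. max_features only limits how many of them are considered as candidates each time a tree looks for a split.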

>> grid_search.best_params_ :- {'max_features': 8, 'n_estimators': 30}

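This means that, among the candidate values n_estimators {3, 10, 30} and max_features {2, 4, 6, 8}, these are the best hyperparameters, i.e. the combination with the best cross-validated score, and the one you should use for your model.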
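One way to see why a combination wins is to look at the scores of all candidates. The sketch below runs a reduced version of the search on synthetic data (an assumption; the winning combination on real data may differ) and lists every candidate's cross-validated RMSE.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption): 8 feature columns
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=42)

param_grid = [{'n_estimators': [3, 10, 30], 'max_features': [2, 4, 6, 8]}]
grid_search = GridSearchCV(RandomForestRegressor(random_state=42), param_grid,
                           cv=5, scoring='neg_mean_squared_error')
grid_search.fit(X, y)

# Scores are negated MSE, so flip the sign before taking the square root
results = grid_search.cv_results_
for mean_score, params in zip(results['mean_test_score'], results['params']):
    print(round(float(np.sqrt(-mean_score)), 2), params)

print('best:', grid_search.best_params_)

# With the default refit=True, this is already retrained on all of X, y
best_model = grid_search.best_estimator_
```

If the best value sits at the edge of the grid, as n_estimators=30 and max_features=8 do in the question, it may be worth searching again with larger values, since the score could still be improving.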
