When I perform a grid search with GridSearchCV and xgboost:

import xgboost as xgb
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# random_state and param_grid are assumed to be defined earlier;
# shuffle must be True for random_state to have any effect
kfold = StratifiedKFold(n_splits=3, shuffle=True, random_state=random_state)
model = xgb.XGBClassifier()
grid_search = GridSearchCV(model, param_grid, scoring="roc_auc",
                           n_jobs=4, cv=kfold, verbose=1)

how many boosting rounds does GridSearchCV use internally?
There is no definitive answer to this, but the best strategy is to set the number of rounds high (500, 1000, or even more) and combine it with the early_stopping_rounds parameter. Boosting then continues until it starts to overfit the evaluation fold, at which point the CV has found good-enough parameters (from a bias-variance tradeoff perspective). In essence, even though you may have configured far more boosting steps than necessary, boosting will likely never actually run for that many rounds.
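Below is a minimal sketch of that strategy. The data split, grid values, and the variable names X and y are illustrative assumptions, and it assumes an xgboost version (1.6 or newer) where early_stopping_rounds and eval_metric are constructor arguments rather than fit() arguments:

import xgboost as xgb
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split

# Hold out a validation set for early stopping (illustrative 80/20 split)
X_fit, X_val, y_fit, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

# A large n_estimators is only an upper bound; early stopping picks the real count
model = xgb.XGBClassifier(n_estimators=1000,
                          early_stopping_rounds=10,
                          eval_metric="auc")

param_grid = {"max_depth": [3, 6], "learning_rate": [0.05, 0.1]}  # illustrative grid
kfold = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)

grid_search = GridSearchCV(model, param_grid, scoring="roc_auc",
                           n_jobs=4, cv=kfold, verbose=1)
# Extra keyword arguments to fit() are forwarded to XGBClassifier.fit,
# so every CV fold reuses the same eval_set for early stopping
grid_search.fit(X_fit, y_fit, eval_set=[(X_val, y_val)])

One caveat with this setup: the same eval_set is reused for every fold, so the stopping point is not tuned per fold. That is a known simplification of the approach.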
Use the best_estimator_ attribute to find the n_estimators parameter:
grid_search.best_estimator_
Output:
XGBRegressor(base_score=0.5, booster='gbtree', colsample_bylevel=1,
colsample_bynode=1, colsample_bytree=1, eta=0.1, gamma=0,
gpu_id=-1, importance_type='gain', interaction_constraints='',
learning_rate=0.100000001, max_delta_step=0, max_depth=6,
min_child_weight=1, missing=nan, monotone_constraints='()',
n_estimators=100, n_jobs=4, num_parallel_tree=1, random_state=0,
reg_alpha=0, reg_lambda=6, scale_pos_weight=1, subsample=0.8,
tree_method='exact', validate_parameters=1, verbosity=None)
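The fitted search object also exposes these values individually, via the standard scikit-learn attributes:

grid_search.best_estimator_.n_estimators   # 100 in the output above
grid_search.best_params_                   # dict of the winning grid values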
You can use this to adjust the hyperparameter values in your grid search:
import xgboost as xgb
from sklearn.model_selection import GridSearchCV

param_grid = [{'max_depth': [6],
               'eta': [0.1],
               'subsample': [0.8],
               'reg_lambda': [6],
               'n_estimators': [10, 100, 1000]}]
xgb_model = xgb.XGBRegressor()
grid_search = GridSearchCV(xgb_model, param_grid, cv=5, return_train_score=True,
                           scoring='neg_mean_squared_error')
grid_search.fit(X_train, y_train)
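After fitting, you can check which n_estimators value won and how it scored, again through the standard GridSearchCV attributes:

print(grid_search.best_params_)   # includes the winning 'n_estimators' value
print(-grid_search.best_score_)   # mean CV MSE, negated back from neg_mean_squared_error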