Problem when tuning XGBoost parameters



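I have a classification problem and chose XGBoost, which gave me good accuracy, but I ran into a problem when tuning the hyperparameters with randomized search!

The grid I built: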

xg_grid = {"n_estimators": np.arange(1, 10),
           "max_depth": [None, 6, 8, 10],
           "learning_rate": [0.1, 0.5, 0.8, 1],
           "objective": "reg:logistic",
           "max-depth": np.arange(6, 10),
           "alpha": np.arange(0, 5),
           "colsample_bytree": [0.1, 1, 0.5, 0.3],
           "booster": ["gbtree", "gblinear", "dart"]}

Fitting:


model = xgb.XGBClassifier(random_state=123)
rs_xg_boost = RandomizedSearchCV(model,
                                 param_distributions=xg_grid,
                                 cv=5,
                                 n_iter=10,
                                 n_jobs=-1,
                                 verbose=3)
rs_xg_boost.fit(x_train, y_train)

Error:

Fitting 5 folds for each of 10 candidates, totalling 50 fits
---------------------------------------------------------------------------
XGBoostError                              Traceback (most recent call last)
<ipython-input-98-f3a91cc91b88> in <module>
6 n_jobs = -1,
7 verbose=3)
----> 8 rs_xg_boost.fit(x_train,y_train)
~\Desktop\classifier_algorithm\env\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
61             extra_args = len(args) - len(all_args)
62             if extra_args <= 0:
---> 63                 return f(*args, **kwargs)
64 
65             # extra_args > 0
~\Desktop\classifier_algorithm\env\lib\site-packages\sklearn\model_selection\_search.py in fit(self, X, y, groups, **fit_params)
878             refit_start_time = time.time()
879             if y is not None:
--> 880                 self.best_estimator_.fit(X, y, **fit_params)
881             else:
882                 self.best_estimator_.fit(X, **fit_params)
~\Desktop\classifier_algorithm\env\lib\site-packages\xgboost\core.py in inner_f(*args, **kwargs)
420         for k, arg in zip(sig.parameters, args):
421             kwargs[k] = arg
--> 422         return f(**kwargs)
423 
424     return inner_f
~\Desktop\classifier_algorithm\env\lib\site-packages\xgboost\sklearn.py in fit(self, X, y, sample_weight, base_margin, eval_set, eval_metric, early_stopping_rounds, verbose, xgb_model, sample_weight_eval_set, feature_weights, callbacks)
907             eval_group=None, label_transform=label_transform)
908 
--> 909         self._Booster = train(xgb_options, train_dmatrix,
910                               self.get_num_boosting_rounds(),
911                               evals=evals,
~\Desktop\classifier_algorithm\env\lib\site-packages\xgboost\training.py in train(params, dtrain, num_boost_round, evals, obj, feval, maximize, early_stopping_rounds, evals_result, verbose_eval, xgb_model, callbacks)
225     Booster : a trained booster model
226     """
--> 227     bst = _train_internal(params, dtrain,
228                           num_boost_round=num_boost_round,
229                           evals=evals,
~\Desktop\classifier_algorithm\env\lib\site-packages\xgboost\training.py in _train_internal(params, dtrain, num_boost_round, evals, obj, feval, xgb_model, callbacks, evals_result, maximize, verbose_eval, early_stopping_rounds)
100         # Skip the first update if it is a recovery step.
101         if version % 2 == 0:
--> 102             bst.update(dtrain, i, obj)
103             bst.save_rabit_checkpoint()
104             version += 1
~\Desktop\classifier_algorithm\env\lib\site-packages\xgboost\core.py in update(self, dtrain, iteration, fobj)
1278 
1279         if fobj is None:
-> 1280             _check_call(_LIB.XGBoosterUpdateOneIter(self.handle,
1281                                                     ctypes.c_int(iteration),
1282                                                     dtrain.handle))
~\Desktop\classifier_algorithm\env\lib\site-packages\xgboost\core.py in _check_call(ret)
187     """
188     if ret != 0:
--> 189         raise XGBoostError(py_str(_LIB.XGBGetLastError()))
190 
191 
XGBoostError: [17:32:13] ..\src\objective\objective.cc:26: Unknown objective function: `i`
Objective candidate: survival:aft
Objective candidate: binary:hinge
Objective candidate: multi:softmax
Objective candidate: multi:softprob
Objective candidate: rank:pairwise
Objective candidate: rank:ndcg
Objective candidate: rank:map
Objective candidate: reg:squarederror
Objective candidate: reg:squaredlogerror
Objective candidate: reg:logistic
Objective candidate: reg:pseudohubererror
Objective candidate: binary:logistic
Objective candidate: binary:logitraw
Objective candidate: reg:linear
Objective candidate: count:poisson
Objective candidate: survival:cox
Objective candidate: reg:gamma
Objective candidate: reg:tweedie

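I don't understand what's wrong. I have my grid and I fit it, so why doesn't it work?!

When you pass the grid to RandomizedSearchCV as a dict, every value should be a list of candidate values to sample from, even if there is only one candidate. However, your grid contains the following key-value pair: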

"objective": "reg:logistic"

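This causes RandomizedSearchCV to sample a single character from the string "reg:logistic" instead of selecting the whole string, which is why the error reports the unknown objective function `i`. The correct way is to provide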

"objective": ["reg:logistic"]
