I have a classification problem and chose XGBoost. It gives me good accuracy, but I've run into a problem when tuning the hyperparameters with randomized search!
The grid I built:
xg_grid = {"n_estimators": np.arange(1, 10),
           "max_depth": [None, 6, 8, 10],
           "learning_rate": [0.1, 0.5, 0.8, 1],
           "objective": "reg:logistic",
           "max-depth": np.arange(6, 10),
           "alpha": np.arange(0, 5),
           "colsample_bytree": [0.1, 1, 0.5, 0.3],
           "booster": ["gbtree", "gblinear", "dart"]}
Fitting it:
model = xgb.XGBClassifier(random_state=123)
rs_xg_boost = RandomizedSearchCV(model,
                                 param_distributions=xg_grid,
                                 cv=5,
                                 n_iter=10,
                                 n_jobs=-1,
                                 verbose=3)
rs_xg_boost.fit(x_train, y_train)
The error:
Fitting 5 folds for each of 10 candidates, totalling 50 fits
---------------------------------------------------------------------------
XGBoostError Traceback (most recent call last)
<ipython-input-98-f3a91cc91b88> in <module>
6 n_jobs = -1,
7 verbose=3)
----> 8 rs_xg_boost.fit(x_train,y_train)
~\Desktop\classifier_algorithm\env\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0
~\Desktop\classifier_algorithm\env\lib\site-packages\sklearn\model_selection\_search.py in fit(self, X, y, groups, **fit_params)
878 refit_start_time = time.time()
879 if y is not None:
--> 880 self.best_estimator_.fit(X, y, **fit_params)
881 else:
882 self.best_estimator_.fit(X, **fit_params)
~\Desktop\classifier_algorithm\env\lib\site-packages\xgboost\core.py in inner_f(*args, **kwargs)
420 for k, arg in zip(sig.parameters, args):
421 kwargs[k] = arg
--> 422 return f(**kwargs)
423
424 return inner_f
~\Desktop\classifier_algorithm\env\lib\site-packages\xgboost\sklearn.py in fit(self, X, y, sample_weight, base_margin, eval_set, eval_metric, early_stopping_rounds, verbose, xgb_model, sample_weight_eval_set, feature_weights, callbacks)
907 eval_group=None, label_transform=label_transform)
908
--> 909 self._Booster = train(xgb_options, train_dmatrix,
910 self.get_num_boosting_rounds(),
911 evals=evals,
~\Desktop\classifier_algorithm\env\lib\site-packages\xgboost\training.py in train(params, dtrain, num_boost_round, evals, obj, feval, maximize, early_stopping_rounds, evals_result, verbose_eval, xgb_model, callbacks)
225 Booster : a trained booster model
226 """
--> 227 bst = _train_internal(params, dtrain,
228 num_boost_round=num_boost_round,
229 evals=evals,
~\Desktop\classifier_algorithm\env\lib\site-packages\xgboost\training.py in _train_internal(params, dtrain, num_boost_round, evals, obj, feval, xgb_model, callbacks, evals_result, maximize, verbose_eval, early_stopping_rounds)
100 # Skip the first update if it is a recovery step.
101 if version % 2 == 0:
--> 102 bst.update(dtrain, i, obj)
103 bst.save_rabit_checkpoint()
104 version += 1
~\Desktop\classifier_algorithm\env\lib\site-packages\xgboost\core.py in update(self, dtrain, iteration, fobj)
1278
1279 if fobj is None:
-> 1280 _check_call(_LIB.XGBoosterUpdateOneIter(self.handle,
1281 ctypes.c_int(iteration),
1282 dtrain.handle))
~\Desktop\classifier_algorithm\env\lib\site-packages\xgboost\core.py in _check_call(ret)
187 """
188 if ret != 0:
--> 189 raise XGBoostError(py_str(_LIB.XGBGetLastError()))
190
191
XGBoostError: [17:32:13] ..\src\objective\objective.cc:26: Unknown objective function: `i`
Objective candidate: survival:aft
Objective candidate: binary:hinge
Objective candidate: multi:softmax
Objective candidate: multi:softprob
Objective candidate: rank:pairwise
Objective candidate: rank:ndcg
Objective candidate: rank:map
Objective candidate: reg:squarederror
Objective candidate: reg:squaredlogerror
Objective candidate: reg:logistic
Objective candidate: reg:pseudohubererror
Objective candidate: binary:logistic
Objective candidate: binary:logitraw
Objective candidate: reg:linear
Objective candidate: count:poisson
Objective candidate: survival:cox
Objective candidate: reg:gamma
Objective candidate: reg:tweedie
I don't understand what went wrong. I built my grid and fit it, so why doesn't it work?!
When you pass the grid to RandomizedSearchCV as a dict, each value should be a list of parameter values to sample from, even if it holds only one item. However, your grid contains this key-value pair:

"objective": "reg:logistic"

This causes RandomizedSearchCV to sample a single character from the string "reg:logistic" instead of selecting the whole string — which is why the error complains about the unknown objective function `i`. The correct way is to provide:

"objective": ["reg:logistic"]