"助推器"对象没有属性"分数" - 准确性



While searching for the best parameters for xgboost, I ran into a problem.

The whole process went smoothly and I managed to attach the parameters to the model and check its accuracy, but my solution is very crude and not great (I attach the parameters "manually" to a previously created model).

When I try to check the accuracy of the model, I get the following error:

AttributeError: 'Booster' object has no attribute 'score'

Accuracy check:

accuracy = classifier.score(X_test, y_test)
print(accuracy*100,'%')

I'm posting all my code below (all of it, because I don't know exactly where the error occurs):

# Fitting XGBoost to the Training set
import xgboost as xgb
from xgboost import XGBClassifier
classifier = XGBClassifier()
classifier.fit(X_train, y_train)
# Predicting the Test set results
y_pred = classifier.predict(X_test)
# here the accuracy is checked without any problem
accuracy = classifier.score(X_test, y_test)
print(accuracy*100,'%')
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)
params = {
    # Parameters that we are going to tune.
    'max_depth': 6,
    'min_child_weight': 1,
    'eta': .3,
    'lambda': .1,
    'subsample': 1,
    'colsample_bytree': 1,
    # Other parameters
    'objective': 'reg:squarederror',
}
params['eval_metric'] = "rmse"
num_boost_round = 999
model = xgb.train(
    params,
    dtrain,
    num_boost_round=num_boost_round,
    evals=[(dtest, "Test")],
    early_stopping_rounds=10
)
print("Best RMSE: {:.2f} with {} rounds".format(
    model.best_score,
    model.best_iteration+1))
cv_results = xgb.cv(
    params,
    dtrain,
    num_boost_round=num_boost_round,
    seed=42,
    nfold=5,
    metrics={'rmse'},
    early_stopping_rounds=10
)
cv_results
cv_results['test-rmse-mean'].min()
gridsearch_params = [
    (max_depth, min_child_weight)
    for max_depth in range(9,12)
    for min_child_weight in range(5,8)
]
min_rmse = float("Inf")
best_params = None
for max_depth, min_child_weight in gridsearch_params:
    print("CV with max_depth={}, min_child_weight={}".format(
        max_depth,
        min_child_weight))
    # Update our parameters
    params['max_depth'] = max_depth
    params['min_child_weight'] = min_child_weight
    # Run CV
    cv_results = xgb.cv(
        params,
        dtrain,
        num_boost_round=num_boost_round,
        seed=42,
        nfold=5,
        metrics={'rmse'},
        early_stopping_rounds=10
    )
    # Update best RMSE
    mean_rmse = cv_results['test-rmse-mean'].min()
    boost_rounds = cv_results['test-rmse-mean'].argmin()
    print("RMSE {} for {} rounds".format(mean_rmse, boost_rounds))
    if mean_rmse < min_rmse:
        min_rmse = mean_rmse
        best_params = (max_depth, min_child_weight)
print("Best params: {}, {}, RMSE: {}".format(best_params[0], best_params[1], min_rmse))
params['max_depth'] = 9
params['min_child_weight'] = 7
gridsearch_params = [
    (subsample, colsample)
    for subsample in [i/10. for i in range(7,11)]
    for colsample in [i/10. for i in range(7,11)]
]
min_rmse = float("Inf")
best_params = None
# We start by the largest values and go down to the smallest
for subsample, colsample in reversed(gridsearch_params):
    print("CV with subsample={}, colsample={}".format(
        subsample,
        colsample))
    # We update our parameters
    params['subsample'] = subsample
    params['colsample_bytree'] = colsample
    # Run CV
    cv_results = xgb.cv(
        params,
        dtrain,
        num_boost_round=num_boost_round,
        seed=42,
        nfold=5,
        metrics={'rmse'},
        early_stopping_rounds=10
    )
    # Update best score
    mean_rmse = cv_results['test-rmse-mean'].min()
    boost_rounds = cv_results['test-rmse-mean'].argmin()
    print("\tRMSE {} for {} rounds".format(mean_rmse, boost_rounds))
    if mean_rmse < min_rmse:
        min_rmse = mean_rmse
        best_params = (subsample, colsample)
print("Best params: {}, {}, RMSE: {}".format(best_params[0], best_params[1], min_rmse))
params['subsample'] = 1.0
params['colsample_bytree'] = 1.0
%time
# This can take some time…
min_rmse = float("Inf")
best_params = None
for eta in [.3, .2, .1, .05, .01, .005]:
    print("CV with eta={}".format(eta))
    # We update our parameters
    params['eta'] = eta
    # Run and time CV
    %time cv_results = xgb.cv(
        params,
        dtrain,
        num_boost_round=num_boost_round,
        seed=42,
        nfold=5,
        metrics=['rmse'],
        early_stopping_rounds=10
    )
    # Update best score
    mean_rmse = cv_results['test-rmse-mean'].min()
    boost_rounds = cv_results['test-rmse-mean'].argmin()
    print("\tRMSE {} for {} rounds\n".format(mean_rmse, boost_rounds))
    if mean_rmse < min_rmse:
        min_rmse = mean_rmse
        best_params = eta
print("Best params: {}, RMSE: {}".format(best_params, min_rmse))
params['eta'] = .2
classifier = xgb.train(
    params,
    dtrain,
    num_boost_round=num_boost_round,
    evals=[(dtest, "Test")],
    early_stopping_rounds=10
)
num_boost_round = model.best_iteration + 1
best_model = xgb.train(
    params,
    dtrain,
    num_boost_round=num_boost_round,
    evals=[(dtest, "Test")]
)
from sklearn.metrics import mean_absolute_error
mean_absolute_error(best_model.predict(dtest), y_test)
best_model.save_model("my_model.model")
loaded_model = xgb.Booster()
loaded_model.load_model("my_model.model")
# second accuracy check -- this is where the error occurs
accuracy = classifier.score(X_test, y_test)
print(accuracy*100,'%')

It is on this second attempt to check the accuracy that the error occurs.

Your classifier object is of type Booster, which does not have a score method. The line classifier = xgb.train(...) re-bound the name classifier from your XGBClassifier (which does have score, via the scikit-learn API) to the raw Booster returned by xgb.train (which does not).
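You can verify this directly (a quick check, using the variables from your own code):

print(type(classifier))              # <class 'xgboost.core.Booster'> after xgb.train
print(hasattr(classifier, 'score'))  # False -- Booster has no score method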

You can instead get predictions with the predict method and compute your score via sklearn.metrics.
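For example, a minimal sketch of that approach (assuming your y_test labels are binary 0/1; since the model was trained with the 'reg:squarederror' objective, predict returns continuous values, so they have to be thresholded before computing accuracy):

import xgboost as xgb
from sklearn.metrics import accuracy_score

dtest = xgb.DMatrix(X_test, label=y_test)
raw_preds = classifier.predict(dtest)   # Booster.predict expects a DMatrix
y_pred = (raw_preds > 0.5).astype(int)  # 0.5 threshold is an assumption
accuracy = accuracy_score(y_test, y_pred)
print(accuracy*100, '%')

Alternatively, if you want to keep calling score, pass the tuned parameters to the scikit-learn wrapper instead of xgb.train, e.g. XGBClassifier(max_depth=9, min_child_weight=7, ...), call fit, and score will work as before.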