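I'm taking a course on Coursera, and to pass it I need to submit one last assignment, but I can't get it to work. I get a NotFittedError on line 16 of my code. Can anyone help me figure out what is wrong with it?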
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve, auc
def engagement_model():
    train = pd.read_csv('assets/train.csv')
    train_X = train[train.columns[1:9]]
    train_y = train.iloc[:, 9:]
    test = pd.read_csv('assets/test.csv')
    X_train, X_test, y_train, y_test = train_test_split(train_X, train_y)
    class_rf=RandomForestClassifier()
    grid_values = {'n_estimators':[10,100], 'max_depth': [None, 30]}
    grid_clf_auc = GridSearchCV(class_rf, param_grid=grid_values, scoring='roc_auc_score')
    predict_test = grid_clf_auc.predict_proba(test[test.columns[1:9]])
    predict_test = predict_test[:,1]
    return pd.series(predict_test, index=[test['id']])
engagement_model()
The error I get is:
NotFittedError: This GridSearchCV instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.
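You get the error because you never fitted the model. You first need to fit it to your training data; only then can you predict on the test data. Call the .fit() method on grid_clf_auc.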
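Before calling grid_clf_auc.predict_proba, you need to call grid_clf_auc.fit(train_X, train_y). Otherwise you have only created a GridSearchCV object without fitting it to your data, and a classifier that has not been fitted to data cannot make predictions.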
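To make the order of calls concrete, here is a minimal sketch of fit-then-predict. It is not the assignment solution: the synthetic data from make_classification stands in for assets/train.csv (assumed here to have 8 feature columns and a binary target), and cv=3 and random_state=0 are arbitrary choices. Note two other things you will probably hit once the fit is in place: the scoring string for ROC AUC is 'roc_auc' (not 'roc_auc_score'), and the pandas class is pd.Series with a capital S.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import NotFittedError
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for assets/train.csv (assumption: 8 features, binary target).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

grid_values = {'n_estimators': [10, 100], 'max_depth': [None, 30]}
grid_clf_auc = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid=grid_values,
    scoring='roc_auc',  # built-in scorer name for ROC AUC
    cv=3,               # arbitrary choice to keep the sketch fast
)

# Predicting before fitting reproduces the error from the question.
try:
    grid_clf_auc.predict_proba(X_test)
    raised = False
except NotFittedError:
    raised = True

# Fit on the training split first, then predict on held-out data.
grid_clf_auc.fit(X_train, y_train)
proba = grid_clf_auc.predict_proba(X_test)[:, 1]  # probability of the positive class
result = pd.Series(proba)  # note the capital S in pd.Series
```

In your function, the same pattern would mean calling grid_clf_auc.fit on your training features and target right after constructing the GridSearchCV and before the predict_proba line.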