Here is my code:
#Naive Bayes
from sklearn.naive_bayes import GaussianNB
clf = GaussianNB()
clf.fit(X_train, y_train)
prediction = clf.predict(X_test)
scores = cross_val_score(clf, X, y, cv=5)
print(accuracy_score(prediction, y_test))
Here is my error:
ValueError Traceback (most recent call last)
<ipython-input-46-6d6525f64959> in <module>()
2 from sklearn.naive_bayes import GaussianNB
3 clf = GaussianNB()
----> 4 clf.fit(X_train, y_train)
5 prediction = clf.predict(X_test)
6 scores = cross_val_score(clf, X, y, cv=5)
3 frames
/usr/local/lib/python3.7/dist-packages/sklearn/utils/validation.py in _assert_all_finite(X, allow_nan, msg_dtype)
58 msg_err.format
59 (type_err,
---> 60 msg_dtype if msg_dtype is not None else X.dtype)
61 )
62 # for object dtype data, we only check for NaNs (GH-13254)
ValueError: Input contains NaN, infinity or a value too large for dtype('float64').
I am trying to train a model with the naive Bayes method, but I keep getting this error.
It looks like your data contains NaN values. Drop them with this code:
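# Drop rows that contain NaN (assumes X_train is a pandas DataFrame)
X_train = X_train.dropna()
# Keep the labels aligned with the remaining rows (assumes y_train shares X_train's index)
y_train = y_train.loc[X_train.index]
# Naive Bayes
from sklearn.naive_bayes import GaussianNB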
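This code removes the rows that contain NaN values. If you would rather fill the NaNs than drop them, you can use X_train.fillna(X_train.mean()), which fills each NaN with the mean of its column.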
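Both options can be sketched end to end as follows. This is only an illustration under stated assumptions: the small DataFrames below are made-up stand-ins for the asker's X_train, y_train, X_test and y_test, and the features are assumed to be numeric pandas columns. Since the error message also mentions infinity, the sketch first converts infinite values to NaN so that one treatment covers both cases.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import GaussianNB

# Hypothetical toy data standing in for the asker's real train/test split.
X_train = pd.DataFrame({"a": [1.0, 2.0, np.nan, 4.0, 5.0, 6.0],
                        "b": [0.5, np.inf, 1.5, 2.0, 2.5, 3.0]})
y_train = pd.Series([0, 0, 0, 1, 1, 1])
X_test = pd.DataFrame({"a": [1.5, np.nan], "b": [0.7, 2.8]})
y_test = pd.Series([0, 1])

# Treat infinities as missing values, since the error covers both NaN and inf.
X_train = X_train.replace([np.inf, -np.inf], np.nan)
X_test = X_test.replace([np.inf, -np.inf], np.nan)

# Count missing values per column to see where the problem comes from.
nan_counts = X_train.isna().sum()
print(nan_counts)

# Option 1: drop incomplete rows and keep the labels aligned by index.
X_drop = X_train.dropna()
y_drop = y_train.loc[X_drop.index]

# Option 2: fill with column means computed on the training data only,
# then reuse the same means for the test data.
train_means = X_train.mean()
X_train_filled = X_train.fillna(train_means)
X_test_filled = X_test.fillna(train_means)

# Fit and evaluate on the filled data (Option 2).
clf = GaussianNB()
clf.fit(X_train_filled, y_train)
prediction = clf.predict(X_test_filled)
accuracy = accuracy_score(y_test, prediction)
print(accuracy)
```

Whichever option is chosen, X_test needs the same treatment, because clf.predict performs the same finiteness check as clf.fit. The same applies to the full X passed to cross_val_score.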