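I built a model with the sklearn Naive Bayes classifier, and I need to know how to predict the class of a sentence that comes from user input.
When I simply hard-code the sentence, it works fine. It looks like this: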
new_sentence = ['its so broken']
new_testdata_tfidf = tfidf.transform(new_sentence)
# transform it to a matrix to see the TF-IDF scores against the training data
feature_selection = selection.transform(new_testdata_tfidf)
# transform the new data to see whether features are removed or not, because after TF-IDF I use chi2 feature selection
predicted = classifier.predict(feature_selection)
# then predict it; the classification output is class -1, which is the correct answer
I need to type the text in by hand as input, so I used it like this:
new_sentence = input[('')]
# I typed in the same sentence: its so broken
new_testdata_tfidf = tfidf.transform(new_sentence)
# transform it to a matrix to see the TF-IDF scores against the training data
feature_selection = selection.transform(new_testdata_tfidf)
# transform the new data to see whether features are removed or not, because after TF-IDF I use chi2 feature selection
predicted = classifier.predict(feature_selection)
But it gives me this output:
  File "C:\Users\Myfile\OneDrive\Desktop\model.py", line 170, in <module>
    new_testdata_tfidf= tfidf.transform(new_sentence)
  File "E:\anaconda3\lib\site-packages\sklearn\feature_extraction\text.py", line 1898, in transform
    X = super().transform(raw_documents)
  File "E:\anaconda3\lib\site-packages\sklearn\feature_extraction\text.py", line 1265, in transform
    "Iterable over raw text documents expected, "
ValueError: Iterable over raw text documents expected, string object received.
How can I solve this? Any help is really appreciated.
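Have you tried passing the new sentence as an array?
i.e. new_testdata_tfidf = tfidf.transform([new_sentence])
Your first example passes an array with a single string element, while the second passes just a bare string, which is what the ValueError is complaining about.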
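To illustrate why the bare string is rejected, here is a minimal sketch in plain Python. The function check_documents is a hypothetical stand-in written for this answer, not sklearn's actual code; it only mimics the kind of check that the error message in the traceback suggests.

```python
def check_documents(raw_documents):
    """Hypothetical stand-in for the input check hinted at by the traceback."""
    # A str is itself iterable (character by character), so it has to be
    # rejected explicitly; otherwise each character would count as a document.
    if isinstance(raw_documents, str):
        raise ValueError(
            "Iterable over raw text documents expected, string object received."
        )
    return list(raw_documents)


sentence = "its so broken"

# Iterating over the bare string yields characters, not sentences.
chars = list(sentence)

# Wrapping the string in a list yields exactly one document.
docs = check_documents([sentence])

# Passing the bare string triggers the same kind of error as in the question.
try:
    check_documents(sentence)
    rejected = False
except ValueError:
    rejected = True
```

So the hard-coded version works because ['its so broken'] is a list holding one document, while the typed-in version hands over the string itself.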
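If you are trying to pass a list of strings with new_sentence = input[('')],
you may need to replace it with the following, which calls input() and wraps the returned string in a list: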
new_sentence = [input()]
Hope this helps.
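Putting it together, something along these lines might work. This is only a sketch: read_documents and predict_sentence are hypothetical helper names made up for this answer, and tfidf, selection and classifier are assumed to be the already-fitted objects from your question.

```python
def read_documents(prompt="", reader=input):
    """Read one line of text and return it as a one-element list of documents."""
    # reader defaults to the built-in input(); it is a parameter only so the
    # function can be exercised without typing at the console.
    sentence = reader(prompt)
    return [sentence]


def predict_sentence(documents, tfidf, selection, classifier):
    """Run documents through the same three steps as the code in the question.

    tfidf, selection and classifier are assumed to be already fitted on the
    training data; nothing is fitted here.
    """
    tfidf_matrix = tfidf.transform(documents)      # TF-IDF features
    selected = selection.transform(tfidf_matrix)   # chi2-selected features
    return classifier.predict(selected)            # predicted class labels
```

You would then call it as predicted = predict_sentence(read_documents(), tfidf, selection, classifier), which should behave the same way as your hard-coded example for the same sentence.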