I'm using LinearSVC to classify text, but I'd like each prediction to come with a confidence score.
This is what I have right now:
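from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.svm import LinearSVC

train_set = self.read_training_files()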
count_vect = CountVectorizer()
X_train_counts = count_vect.fit_transform([e[0] for e in train_set])
tfidf_transformer = TfidfTransformer()
X_train_tfidf = tfidf_transformer.fit_transform(X_train_counts)
clf = LinearSVC(C=1).fit(X_train_tfidf, [e[1] for e in train_set])
foods = list(self.get_foods())
lenfoods = len(foods)
i = 0
for food in foods:
    fd = self.get_modified_food(food)
    food_desc = fd['fields']['title'].replace(',', '').lower()
    X_new_counts = count_vect.transform([food_desc])
    X_new_tfidf = tfidf_transformer.transform(X_new_counts)
    predicted = clf.predict(X_new_tfidf)
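The variable "predicted" holds the predicted class number, but no confidence value. I've been reading the source code here, but I couldn't find a suitable attribute for this.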
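I think you're looking in the wrong place :). Have you looked at
the relevant decision function (LinearSVC.decision_function)?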
Personally, I find sklearn's documentation very helpful; sometimes more so than the code :)
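
For illustration, here is a minimal sketch of how decision_function might be used to attach a score to each prediction. The toy training data and the helper predict_with_score are made up for this example and are not from the question. Note that the values are signed distances to each class's separating hyperplane, not probabilities, so they are mainly useful for ranking or thresholding predictions.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.svm import LinearSVC

# Hypothetical toy data standing in for self.read_training_files()
train_set = [
    ("apple pie with cinnamon", "dessert"),
    ("chocolate cake", "dessert"),
    ("vanilla ice cream", "dessert"),
    ("grilled chicken breast", "main"),
    ("beef stew with potatoes", "main"),
    ("roast pork with rice", "main"),
    ("orange juice", "drink"),
    ("green tea", "drink"),
    ("black coffee", "drink"),
]

count_vect = CountVectorizer()
X_train_counts = count_vect.fit_transform([e[0] for e in train_set])
tfidf_transformer = TfidfTransformer()
X_train_tfidf = tfidf_transformer.fit_transform(X_train_counts)
clf = LinearSVC(C=1).fit(X_train_tfidf, [e[1] for e in train_set])


def predict_with_score(text):
    """Return (predicted_label, decision_score) for one text (hypothetical helper)."""
    X_new = tfidf_transformer.transform(count_vect.transform([text]))
    # With three or more classes, decision_function returns an array of
    # shape (n_samples, n_classes); with two classes it is 1-D instead,
    # so this helper assumes the multiclass case.
    scores = clf.decision_function(X_new)[0]
    best = int(np.argmax(scores))
    return clf.classes_[best], float(scores[best])


label, score = predict_with_score("chocolate ice cream")
print(label, score)
```

The label picked by the highest score is the same one clf.predict would return; a larger score means the sample lies further on the positive side of that class's hyperplane.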