Python CountVectorizer() vocabulary_ get method returns None

Updated: 2023-08-29

I have this code, based on the documentation at http://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html:

from sklearn.datasets import load_files
from sklearn.feature_extraction.text import CountVectorizer

count_vect = CountVectorizer()
# Raw string, so the backslashes in the Windows path are not read as escapes
my_bunch = load_files(r"c:\temp\billing_test")
my_data = my_bunch['data']
print(my_bunch.keys())
print('target_names', my_bunch['target_names'])
print('length of data', len(my_bunch['data']))

# Learn the vocabulary and build the document-term count matrix
X_train_counts = count_vect.fit_transform(my_data)
print(X_train_counts.shape)
# Look up the column index of a single term in the learned vocabulary
print(count_vect.vocabulary_.get(u'algorithm'))

The output is as follows:

dict_keys(['target', 'filenames', 'target_names', 'data', 'DESCR'])
target_names ['false', 'true']
length of data 920
(920, 8773)
None

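I would like to know why None is printed at the bottom, right after (920, 8773).

I have about 460 text documents in each of the two folders, "true" and "false".

Thanks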

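Because the word 'algoritham' never appears in your documents. vocabulary_ is a plain dict that maps each term seen during fitting to its column index, so get() returns None for any term that is not in it.

Maybe you should try 'algorithm'.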
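
The behaviour described above can be reproduced with a minimal sketch. The two-document corpus below is a hypothetical stand-in for the asker's 920 files, and the variable names are illustrative only; the point is just that a term absent from the fitted vocabulary yields None.

```python
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical toy corpus standing in for the asker's 920 files.
docs = [
    "a simple sorting algorithm",
    "billing records for march",
]

count_vect = CountVectorizer()
count_vect.fit(docs)

# vocabulary_ is a plain dict: term -> column index in the count matrix.
vocab = count_vect.vocabulary_

# A term seen during fitting maps to an integer column index.
found = vocab.get("algorithm")

# A misspelled term was never seen, so dict.get falls back to None.
missing = vocab.get("algoritham")

print(found, missing)
print("algoritham" in vocab)  # membership test avoids the None ambiguity
print(sorted(vocab))          # list every term the vectorizer learned
```

Because a valid column index can be 0, it is probably safer to test the result with "is None" (or use "in") than to rely on truthiness. Printing the sorted vocabulary is one quick way to check whether a term was tokenized the way you expected.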
