Python sklearn TfidfVectorizer argument error



I have been using sklearn's TfidfVectorizer, but it suddenly started throwing an error:

TypeError: __init__() takes 1 positional argument but 2 positional arguments 
(and 4 keyword-only arguments) were given

The call I am making is:

tfidf_vectorizer = TfidfVectorizer(X_train, ngram_range=(1, 2), max_df=0.9, min_df=5, token_pattern=r'(\S+)')

where X_train is a list of strings, for example:

'done earlier siesta',
'sunday mass us family greatful opportunity',
'wet wet wet frustrated outside',
'tired headache headache',
'friends creative talented inspired friendship love creatives',
'grateful lucky beaches sunshine hubby family pets awesome sunday',
'latest artwork',
'two headache sick tired sore'

I am confused about why it says I gave two positional arguments when I only passed the single list X_train. Even if I reduce the statement to:

TfidfVectorizer(X_train)

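it still gives the same error, saying I gave two positional arguments. I am using sklearn 1.0.1; I tried reverting to 1.0.0, but the error is the same. Could the problem be in the list I am passing?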

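The library and its implementation have indeed changed. In version 0.23.1, the same call only produces a warning saying that the argument must be passed as a keyword argument: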

tfidvect=TfidfVectorizer(X_train)
FutureWarning: Pass input=['done earlier siesta', 'sunday mass us family greatful opportunity', 'wet wet wet frustrated outside', 'tired headache headache', 'friends creative talented inspired friendship love creatives', 'grateful lucky beaches sunshine hubby family pets awesome sunday', 'latest artwork', 'two headache sick tired sore'] as keyword args. From version 0.25 passing these as positional arguments will result in an error
warnings.warn("Pass {} as keyword args. From version 0.25 "

Fast forward to 1.0.1, where positional arguments are rejected outright, and the same call has to be written as:

tfidvect1_01 = TfidfVectorizer(input=X_train)  # input passed as a keyword argument
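
As for why the message counts two positional arguments when only one list was passed: the implicit `self` is the first one, and X_train is the second. The following is a minimal sketch of that behavior in plain Python. `ToyVectorizer` is a hypothetical stand-in, not scikit-learn code, and its parameter list is abbreviated.

```python
class ToyVectorizer:
    """Hypothetical stand-in (not scikit-learn): every option is keyword-only."""

    # The bare `*` means nothing after `self` may be passed positionally.
    def __init__(self, *, input="content", ngram_range=(1, 1), max_df=1.0, min_df=1):
        self.input = input
        self.ngram_range = ngram_range
        self.max_df = max_df
        self.min_df = min_df


docs = ["tired headache headache", "latest artwork"]

try:
    # Two positional arguments reach __init__: the implicit `self` and `docs`.
    ToyVectorizer(docs)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)

# Naming each option works, because only `self` is positional.
vec = ToyVectorizer(ngram_range=(1, 2), min_df=5)
print(vec.ngram_range, vec.min_df)
```

The exact wording of the TypeError varies between Python versions, but it always counts `self` among the positional arguments.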

@Ambrayers added:

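Another way is to create the object first and then call fit_transform on the data, following the example in the official documentation. This is also the intended usage: the `input` parameter expects one of 'filename', 'file', or 'content', so the training data belongs in fit_transform rather than in the constructor.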

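from sklearn.feature_extraction.text import TfidfVectorizer

vectorizer = TfidfVectorizer()  # configuration, if any, goes here as keyword arguments
X_train = vectorizer.fit_transform(X_train)  # the documents are passed here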
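
Putting this together with the settings from the question, a corrected call could look like the sketch below. It assumes the token pattern was meant to be `r'(\S+)'`, and it lowers min_df from 5 to 1, because in the eight-document excerpt shown above no term occurs in five documents and fitting would fail with an empty vocabulary. The variable name `X_train_tfidf` is only illustrative.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

X_train = [
    "done earlier siesta",
    "sunday mass us family greatful opportunity",
    "wet wet wet frustrated outside",
    "tired headache headache",
    "friends creative talented inspired friendship love creatives",
    "grateful lucky beaches sunshine hubby family pets awesome sunday",
    "latest artwork",
    "two headache sick tired sore",
]

# All configuration is passed to the constructor by keyword.
# min_df=1 is an assumption made for this small excerpt (the question used 5).
tfidf_vectorizer = TfidfVectorizer(
    ngram_range=(1, 2),
    max_df=0.9,
    min_df=1,
    token_pattern=r"(\S+)",
)

# The documents themselves go to fit_transform, not to the constructor.
X_train_tfidf = tfidf_vectorizer.fit_transform(X_train)

# One row per document, one column per unigram or bigram in the vocabulary.
print(X_train_tfidf.shape)
```

On the full training set, the original min_df=5 can be restored as long as some terms occur in at least five documents.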
