Sklearn: "invalid value encountered in true_divide" when using SelectKBest(f_classif, ...)



I'm not sure what is causing this warning:

RuntimeWarning: invalid value encountered in true_divide
msw = sswn / float(dfwn)

It appears when I run the following:

import io
import pandas as pd
from sklearn import model_selection
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import f_classif

df = pd.read_csv(
    io.StringIO(
        "x0,x1,y\n"
        "10.354468012163927,7.655143584899129,168.06121374114608\n"
        "8.786243147880384,6.244283164157256,156.570749155167\n"
        "10.450548129254543,8.084427493431185,152.10261405911672\n"
        "10.869778308219216,9.165630427431644,129.72126680171317\n"
        "11.236593954599316,5.7987616455741575,55.294961794556315\n"
        "9.111226379916955,10.289447419679227,308.7475968288771\n"
        "9.753313270715008,9.803181441185592,163.337342478704\n"
        "9.752270042969856,9.004988677803736,271.9442757290742\n"
        "8.67161845864426,9.801711898528824,158.09622149503954\n"
        "8.830913103331573,6.632544281651334,316.23912914041557\n"
    )
)
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    df.drop("y", axis=1),
    df["y"],
    test_size=0.2,
)
X_new = SelectKBest(f_classif, k=2).fit_transform(X_train, y_train)

You are using a score function meant for classification, but from what I can see your problem is a regression problem:

          x0         x1           y
0  10.354468   7.655144  168.061214
1   8.786243   6.244283  156.570749
2  10.450548   8.084427  152.102614
3  10.869778   9.165630  129.721267
4  11.236594   5.798762   55.294962
5   9.111226  10.289447  308.747597
6   9.753313   9.803181  163.337342
7   9.752270   9.004989  271.944276
8   8.671618   9.801712  158.096221
9   8.830913   6.632544  316.239129

The label y is a floating-point value, not a class.
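To see where the warning probably comes from: f_classif runs a one-way ANOVA and treats every distinct value of y as its own class. The following is a minimal sketch, assuming the within-group degrees of freedom are computed as n_samples - n_classes (which is what the dfwn in the warning message appears to be). The y values are the ones from the question, rounded to two decimals.

```python
import numpy as np

# Target values from the question, rounded for brevity.
y = np.array([168.06, 156.57, 152.10, 129.72, 55.29,
              308.75, 163.34, 271.94, 158.10, 316.24])

n_samples = y.size
# Every distinct float counts as a separate "class" for a classification score.
n_classes = np.unique(y).size
# Assumed formula for the within-group degrees of freedom in one-way ANOVA.
df_within = n_samples - n_classes

print(n_samples, n_classes, df_within)
```

Each class holds exactly one sample, so the within-group sum of squares is zero and the within-group degrees of freedom are zero as well. The line msw = sswn / float(dfwn) then amounts to 0 / 0, which NumPy reports as an invalid value. The same happens on the training split, since its y values are all distinct too.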

Try either of these two instead of f_classif:

from sklearn.feature_selection import f_regression
from sklearn.feature_selection import mutual_info_regression

# Option 1: univariate linear regression F-test
X_new = SelectKBest(f_regression, k=2).fit_transform(X_train, y_train)
# Option 2: mutual information for a continuous target
X_new = SelectKBest(mutual_info_regression, k=2).fit_transform(X_train, y_train)
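As a self-contained check, the sketch below fits SelectKBest with f_regression on the full data from the question (values rounded to six decimals, no train/test split). It uses k=1 rather than the question's k=2, which is an assumption made only for illustration: with just two features, k=2 keeps both columns and no selection is visible.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

# Features (x0, x1) and continuous target y from the question, rounded.
X = np.array([
    [10.354468, 7.655144],
    [8.786243, 6.244283],
    [10.450548, 8.084427],
    [10.869778, 9.165630],
    [11.236594, 5.798762],
    [9.111226, 10.289447],
    [9.753313, 9.803181],
    [9.752270, 9.004989],
    [8.671618, 9.801712],
    [8.830913, 6.632544],
])
y = np.array([168.061214, 156.570749, 152.102614, 129.721267, 55.294962,
              308.747597, 163.337342, 271.944276, 158.096221, 316.239129])

# k=1 is an illustrative choice so that one of the two features is dropped.
selector = SelectKBest(f_regression, k=1)
X_new = selector.fit_transform(X, y)

print(selector.scores_)        # one F-statistic per feature
print(selector.get_support())  # boolean mask of the feature that was kept
print(X_new.shape)
```

The scores should be finite here because f_regression works with the correlation between each feature and the continuous target, and does not group samples by class.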
