I'm not sure what is causing this warning:
RuntimeWarning: invalid value encountered in true_divide
msw = sswn / float(dfwn)
It shows up when I run the following:
import io
import pandas as pd
from sklearn import model_selection
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import f_classif
df = pd.read_csv(
    io.StringIO(
        "x0,x1,y\n10.354468012163927,7.655143584899129,168.06121374114608\n8.786243147880384,6.244283164157256,156.570749155167\n10.450548129254543,8.084427493431185,152.10261405911672\n10.869778308219216,9.165630427431644,129.72126680171317\n11.236593954599316,5.7987616455741575,55.294961794556315\n9.111226379916955,10.289447419679227,308.7475968288771\n9.753313270715008,9.803181441185592,163.337342478704\n9.752270042969856,9.004988677803736,271.9442757290742\n8.67161845864426,9.801711898528824,158.09622149503954\n8.830913103331573,6.632544281651334,316.23912914041557\n"
    )
)
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    df.drop("y", axis=1),
    df["y"],
    test_size=0.2,
)
X_new = SelectKBest(f_classif, k=2).fit_transform(X_train, y_train)
You are using a scoring function meant for classification, but from what I can see your problem is a regression problem:
          x0         x1           y
0  10.354468   7.655144  168.061214
1   8.786243   6.244283  156.570749
2  10.450548   8.084427  152.102614
3  10.869778   9.165630  129.721267
4  11.236594   5.798762   55.294962
5   9.111226  10.289447  308.747597
6   9.753313   9.803181  163.337342
7   9.752270   9.004989  271.944276
8   8.671618   9.801712  158.096221
9   8.830913   6.632544  316.239129
The label y is a float value, not a class.
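As far as I can tell, this is also where the warning comes from. f_classif runs a one-way ANOVA that treats every distinct value of y as its own class, and the line in the traceback divides by the within-group degrees of freedom (number of samples minus number of classes). A rough sketch of that count, using only the standard library; the helper name within_group_dof is made up for illustration and is not part of scikit-learn:

```python
# y values copied from the table above (rounded as printed).
y = [168.061214, 156.570749, 152.102614, 129.721267, 55.294962,
     308.747597, 163.337342, 271.944276, 158.096221, 316.239129]


def within_group_dof(labels):
    """Within-group degrees of freedom of a one-way ANOVA: n_samples - n_classes.

    Hypothetical helper for illustration only.
    """
    n_samples = len(labels)
    n_classes = len(set(labels))  # every distinct value counts as its own class
    return n_samples - n_classes


# With test_size=0.2 the training set has 8 rows. All labels are distinct,
# so whichever 8 rows are drawn, each "class" holds exactly one sample.
dfwn = within_group_dof(y[:8])
print(dfwn)  # 0

# For comparison, real class labels leave degrees of freedom to divide by.
print(within_group_dof([0, 0, 1, 1, 1, 0, 1, 0]))  # 6
```

With dfwn equal to 0 the within-group sum of squares is divided by zero, which would explain the RuntimeWarning.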
Try one of these two scoring functions instead of f_classif:
from sklearn.feature_selection import f_regression
from sklearn.feature_selection import mutual_info_regression
X_new = SelectKBest(f_regression, k=2).fit_transform(X_train, y_train)
X_new = SelectKBest(mutual_info_regression, k=2).fit_transform(X_train, y_train)
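To check what the selector actually did, you can fit it first and look at its scores_ attribute and get_support() mask. The following is only a sketch: it rebuilds the data from the rounded values in the table instead of reading the CSV, skips the train/test split, and uses k=1 (my choice, not from the question) because with only two features k=2 keeps both columns.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression

# Rows copied from the table in this answer (x0, x1, y), rounded as printed.
data = np.array([
    [10.354468, 7.655144, 168.061214],
    [8.786243, 6.244283, 156.570749],
    [10.450548, 8.084427, 152.102614],
    [10.869778, 9.165630, 129.721267],
    [11.236594, 5.798762, 55.294962],
    [9.111226, 10.289447, 308.747597],
    [9.753313, 9.803181, 163.337342],
    [9.752270, 9.004989, 271.944276],
    [8.671618, 9.801712, 158.096221],
    [8.830913, 6.632544, 316.239129],
])
X, y = data[:, :2], data[:, 2]

# k=1 is an assumption made for this sketch so that one column gets dropped.
selector = SelectKBest(f_regression, k=1).fit(X, y)

print(selector.scores_)        # one F-statistic per feature
print(selector.get_support())  # boolean mask marking the kept column
X_new = selector.transform(X)
print(X_new.shape)             # (10, 1)
```

The same pattern works with mutual_info_regression in place of f_regression.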