ValueError: Found input variables with inconsistent numbers of samples: [1600, 400]



Here is my code:

from sklearn.model_selection import train_test_split
x_train,y_train,x_test,y_test=train_test_split(x,y,test_size=0.2,random_state=2020)
from sklearn.neighbors import KNeighborsClassifier
clf=KNeighborsClassifier(n_neighbors=5,metric='euclidean')
clf
clf.fit(x_train,y_train)

This is the error I get:

---------------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-52-91f676f4a5e0> in <module>()
1 
----> 2 clf.fit(x_train,y_train)
2 frames
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in check_consistent_length(*arrays)
210     if len(uniques) > 1:
211         raise ValueError("Found input variables with inconsistent numbers of"
--> 212                          " samples: %r" % [int(l) for l in lengths])
213 
214 
ValueError: Found input variables with inconsistent numbers of samples: [1600, 400]

How can I fix this?

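You have misunderstood the order in which train_test_split returns its results. It returns the train and test parts of the first array, then the train and test parts of the second array, as the example in the documentation shows: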

X_train, X_test, y_train, y_test = train_test_split(...)

So this line:

x_train,y_train,x_test,y_test=train_test_split(x,y,test_size=0.2,random_state=2020)

needs to be changed to:

x_train, x_test, y_train, y_test=train_test_split(x,y,test_size=0.2,random_state=2020)

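The rest of the code will then run. In your version, the name y_train was actually bound to the test portion of x (400 samples), so fit received 1600 samples for x and 400 for y, which is exactly what the error message reports.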
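To check the return order yourself, here is a minimal sketch with synthetic data. The arrays x and y below are stand-ins, not your data: the total of 2000 samples is an assumption inferred from the 1600 and 400 in the error message, and the 3 features and binary labels are arbitrary.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in data (assumption): 2000 samples, 3 features, binary labels.
rng = np.random.RandomState(0)
x = rng.rand(2000, 3)
y = rng.randint(0, 2, size=2000)

# Return order is: x_train, x_test, y_train, y_test.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=2020
)

# Both training arrays now have the same number of samples,
# and so do both test arrays.
print(x_train.shape, y_train.shape)
print(x_test.shape, y_test.shape)

clf = KNeighborsClassifier(n_neighbors=5, metric='euclidean')
clf.fit(x_train, y_train)  # no longer raises ValueError
```

Printing the shapes right after the split is a cheap way to catch this kind of unpacking mistake before calling fit.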
