Here is my code:
from sklearn.model_selection import train_test_split
x_train,y_train,x_test,y_test=train_test_split(x,y,test_size=0.2,random_state=2020)
from sklearn.neighbors import KNeighborsClassifier
clf=KNeighborsClassifier(n_neighbors=5,metric='euclidean')
clf
clf.fit(x_train,y_train)
This is the error I get:
---------------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-52-91f676f4a5e0> in <module>()
1
----> 2 clf.fit(x_train,y_train)
2 frames
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in check_consistent_length(*arrays)
210 if len(uniques) > 1:
211 raise ValueError("Found input variables with inconsistent numbers of"
--> 212 " samples: %r" % [int(l) for l in lengths])
213
214
ValueError: Found input variables with inconsistent numbers of samples: [1600, 400]
How can I fix this?
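You have misunderstood how train_test_split works. It returns both parts of the first array, then both parts of the second. With your unpacking order, y_train actually receives the 400 test rows of x, which is why fit reports [1600, 400]. The example in the documentation shows the correct order: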
X_train, X_test, y_train, y_test = train_test_split(...)
This line of your code:
x_train,y_train,x_test,y_test=train_test_split(x,y,test_size=0.2,random_state=2020)
needs to be changed to:
x_train, x_test, y_train, y_test=train_test_split(x,y,test_size=0.2,random_state=2020)
The rest of the code will then run unchanged.
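To check the return order yourself, here is a minimal sketch. The question does not show x and y, so the data below is a synthetic stand-in (an assumption): 2000 samples, which matches the 1600/400 split in the traceback, with 3 made-up features and binary labels.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the asker's data (assumption, not from the question):
# 2000 samples, 3 features, binary labels derived from the first feature.
rng = np.random.default_rng(0)
x = rng.normal(size=(2000, 3))
y = (x[:, 0] > 0).astype(int)

# train_test_split returns both parts of x first, then both parts of y.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=2020
)

# Features and labels of the same split now have matching sample counts.
shapes = (x_train.shape, x_test.shape, y_train.shape, y_test.shape)
print(shapes)

# Same classifier settings as in the question; fit no longer raises.
clf = KNeighborsClassifier(n_neighbors=5, metric='euclidean')
clf.fit(x_train, y_train)
accuracy = clf.score(x_test, y_test)
print(accuracy)
```

Printing the shapes right after the split is a cheap way to catch a wrong unpacking order before it reaches fit.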