scikit-learn train_test_split: ValueError: Found input variables with inconsistent numbers of samples: [4999, 5000]



Here is my code:

print(len(image_dataset.data))
print(len(phylum_target))
X_train, X_test, y_train, y_test = train_test_split(image_dataset.data, phylum_target, test_size=0.2,random_state=109)

Here is the output and the error:

5000
5000
Traceback (most recent call last):
  File "Image_SVM_run_only.py", line 298, in <module>
    X_train_temp, X_test_temp, y_train_temp, y_test_temp = train_test_split(image_dataset.data, phylum_target, test_size=0.2,random_state=109)
  File "/root/anaconda3/envs/IBC/lib/python3.7/site-packages/sklearn/model_selection/_split.py", line 2127, in train_test_split
    arrays = indexable(*arrays)
  File "/root/anaconda3/envs/IBC/lib/python3.7/site-packages/sklearn/utils/validation.py", line 293, in indexable
    check_consistent_length(*result)
  File "/root/anaconda3/envs/IBC/lib/python3.7/site-packages/sklearn/utils/validation.py", line 257, in check_consistent_length
    " samples: %r" % [int(l) for l in lengths])
ValueError: Found input variables with inconsistent numbers of samples: [4999, 5000]

Even though len() reports the same length (5000) for both the data and the targets, I still get this error. Please help me T.T

This is the smallest reproducible example I could put together from the information you gave, and it works fine:

import numpy as np
from sklearn.model_selection import train_test_split
X = np.zeros((5000, 49152))
y = np.zeros((5000, 1))
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=109)
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
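Since same-length inputs split without trouble, it may help to check what scikit-learn actually receives right before the failing call. The sketch below is only a diagnostic idea, not a confirmed fix: `describe` is a hypothetical helper, and `X` and `y` are made-up stand-ins for `image_dataset.data` and `phylum_target`, deliberately mismatched to reproduce the message. `check_consistent_length` from `sklearn.utils` is the same check that appears in your traceback.

```python
import numpy as np
from sklearn.utils import check_consistent_length


def describe(name, obj):
    """Hypothetical helper: summarize type, len(), and .shape (if any)."""
    shape = getattr(obj, "shape", None)
    return f"{name}: type={type(obj).__name__}, len={len(obj)}, shape={shape}"


# Stand-ins for image_dataset.data and phylum_target (assumed shapes),
# mismatched on purpose so the check fails.
X = np.zeros((4999, 8))
y = np.zeros(5000)

print(describe("X", X))
print(describe("y", y))

message = None
try:
    # The check that train_test_split runs on its inputs.
    check_consistent_length(X, y)
except ValueError as exc:
    message = str(exc)
    print(message)
```

Two things might be worth checking with this. First, the traceback line assigns to `X_train_temp, X_test_temp, ...` while the posted snippet assigns to `X_train, X_test, ...`, so the two `print(len(...))` calls may not be running directly before line 298 of the script. Second, as far as I know scikit-learn counts samples with `.shape[0]` when an object has a `shape` attribute and only falls back to `len()` otherwise, so printing both for the real objects would show whether they disagree.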
