ValueError: Found arrays with inconsistent numbers of samples [1, 299]



The data is here and here; you can download it by clicking the links. I am using pandas, NumPy, and Python 3.

Here is my code:

import pandas as pa
import numpy as nu
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler
def get_accuracy(X_train, y_train, X_test, y_test):
    perceptron = Perceptron()
    perceptron.fit(X_train, y_train)
    perceptron.transform(X_train)
    prediction = perceptron.predict(X_test)
    result = accuracy_score(y_test, prediction)
    return result
test_data = pa.read_csv("C:/Users/Roman/Downloads/perceptron-test.csv")
test_data.columns = ["class", "f1", "f2"]
train_data = pa.read_csv("C:/Users/Roman/Downloads/perceptron-train.csv")
train_data.columns = ["class", "f1", "f2"]
scaler = StandardScaler()
scaler.fit_transform(train_data[train_data.columns[1:]]).reshape(-1,1)
X_train = scaler.transform(train_data[train_data.columns[1:]])
scaler.fit_transform(train_data[train_data.columns[0]])
y_train = scaler.transform(train_data[train_data.columns[0]])
scaler.fit_transform(test_data[test_data.columns[1:]])
X_test = scaler.transform(test_data[test_data.columns[1:]])
scaler.fit_transform(test_data[test_data.columns[0]])
y_test = scaler.transform(test_data[test_data.columns[0]])


scaled_accuracy = get_accuracy(nu.ravel(X_train), nu.ravel(y_train),    nu.ravel(X_test), nu.ravel(y_test))
print(scaled_accuracy)

This is the error I get:

Traceback (most recent call last):
  File "C:/Users/Roman/PycharmProjects/data_project-1/lecture_2_perceptron.py", line 33, in <module>
    scaled_accuracy = get_accuracy(nu.ravel(X_train), nu.ravel(y_train), nu.ravel(X_test), nu.ravel(y_test))
  File "C:/Users/Roman/PycharmProjects/data_project-1/lecture_2_perceptron.py", line 9, in get_accuracy
    perceptron.fit(X_train, y_train)
  File "C:\Users\Roman\AppData\Roaming\Python\Python35\site-packages\sklearn\linear_model\stochastic_gradient.py", line 545, in fit
    sample_weight=sample_weight)
  File "C:\Users\Roman\AppData\Roaming\Python\Python35\site-packages\sklearn\linear_model\stochastic_gradient.py", line 389, in _fit
    X, y = check_X_y(X, y, 'csr', dtype=np.float64, order="C")
  File "C:\Users\Roman\AppData\Roaming\Python\Python35\site-packages\sklearn\utils\validation.py", line 520, in check_X_y
    check_consistent_length(X, y)
  File "C:\Users\Roman\AppData\Roaming\Python\Python35\site-packages\sklearn\utils\validation.py", line 176, in check_consistent_length
    "%s" % str(uniques))
ValueError: Found arrays with inconsistent numbers of samples: [  1 299]

Without scaling the data, everything works fine. After scaling, it fails.

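You should not call fit_transform every time you use the scaler. Fit it once on the training data and from then on only call transform; otherwise the training and test sets end up with different representations, which leads to errors. Scaling the labels does not make sense either.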
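As a rough illustration of that advice, here is a minimal sketch. It assumes the same layout as in the question (label in column 0, features in the remaining columns). The helper names `scaled_perceptron_accuracy` and `make_demo_frame` are made up for this example, and the synthetic frames only stand in for the two CSV files, which are not reproduced here. The sketch also keeps the feature matrix two-dimensional instead of passing it through `ravel`; flattening a (299, 2) array into one long vector is the likely reason scikit-learn counted 1 sample against 299 labels. The `perceptron.transform` call from the question is left out because it is not needed for prediction.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Perceptron
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler


def scaled_perceptron_accuracy(train_data, test_data, random_state=0):
    """Return test accuracy of a Perceptron trained on standardized features.

    Both frames are expected to hold the label in column 0 and the
    features in the remaining columns, as in the question.
    """
    # Keep X two-dimensional: shape (n_samples, n_features). No ravel here.
    X_train = train_data.iloc[:, 1:].to_numpy(dtype=float)
    X_test = test_data.iloc[:, 1:].to_numpy(dtype=float)
    # Labels stay one-dimensional and unscaled.
    y_train = train_data.iloc[:, 0].to_numpy()
    y_test = test_data.iloc[:, 0].to_numpy()

    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)  # fit on training data only
    X_test_scaled = scaler.transform(X_test)        # reuse training mean and std

    perceptron = Perceptron(random_state=random_state)
    perceptron.fit(X_train_scaled, y_train)
    return accuracy_score(y_test, perceptron.predict(X_test_scaled))


def make_demo_frame(n_rows, seed):
    """Build a synthetic stand-in for the CSV files (hypothetical data)."""
    rng = np.random.default_rng(seed)
    features = rng.normal(loc=[0.0, 500.0], scale=[1.0, 200.0], size=(n_rows, 2))
    labels = np.where(features[:, 0] > 0, 1, -1)
    return pd.DataFrame({"class": labels, "f1": features[:, 0], "f2": features[:, 1]})


train_data = make_demo_frame(300, seed=0)
test_data = make_demo_frame(200, seed=1)
accuracy = scaled_perceptron_accuracy(train_data, test_data)
print(accuracy)
```

With the real files, the two `make_demo_frame` calls would be replaced by the `pa.read_csv(...)` calls from the question, and the rest should work unchanged as long as the column order is the same.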
