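I am trying to store individual feature vectors (i.e. X_train[i]) in an array X and their corresponding labels in another array y. When I try to fit on these two arrays, I get the error "ValueError: setting an array element with a sequence." How can I fix this error? Thanks in advance.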
from sklearn.datasets import load_svmlight_file
pathToTrainData="/Users/rkasat/Documents/final year project/scripts/Drydata/leaf/train_backup.txt"
X_train,Y_train= load_svmlight_file(pathToTrainData);
X= []
y=[]
for i in range(5):
    X.append(X_train[i])
    y.append(Y_train[i])
print(type(X[0]),type(y[0]))
from sklearn import svm
clf = svm.SVC(kernel='linear')
clf.fit(X,y)
output:
--------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-41-cd4b481af30a> in <module>()
8 from sklearn import svm
9 clf = svm.SVC(kernel='linear')
---> 10 clf.fit(X,y)
/Users/rkasat/anaconda/lib/python2.7/site-packages/sklearn/svm/base.pyc in fit(self, X, y, sample_weight)
137 "by not using the ``sparse`` parameter")
138
--> 139 X = atleast2d_or_csr(X, dtype=np.float64, order='C')
140 y = self._validate_targets(y)
141
/Users/rkasat/anaconda/lib/python2.7/site-packages/sklearn/utils/validation.pyc in atleast2d_or_csr(X, dtype, order, copy, force_all_finite)
132 """
133 return _atleast2d_or_sparse(X, dtype, order, copy, sparse.csr_matrix,
--> 134 "tocsr", force_all_finite)
135
136
/Users/rkasat/anaconda/lib/python2.7/site-packages/sklearn/utils/validation.pyc in _atleast2d_or_sparse(X, dtype, order, copy, sparse_class, convmethod, force_all_finite)
109 else:
110 X = array2d(X, dtype=dtype, order=order, copy=copy,
--> 111 force_all_finite=force_all_finite)
112 if force_all_finite:
113 _assert_all_finite(X)
/Users/rkasat/anaconda/lib/python2.7/site-packages/sklearn/utils/validation.pyc in array2d(X, dtype, order, copy, force_all_finite)
89 raise TypeError('A sparse matrix was passed, but dense data '
90 'is required. Use X.toarray() to convert to dense.')
---> 91 X_2d = np.asarray(np.atleast_2d(X), dtype=dtype, order=order)
92 if force_all_finite:
93 _assert_all_finite(X_2d)
/Users/rkasat/anaconda/lib/python2.7/site-packages/numpy/core/numeric.pyc in asarray(a, dtype, order)
318
319 """
--> 320 return array(a, dtype, copy=False, order=order)
321
322 def asanyarray(a, dtype=None, order=None):
ValueError: setting an array element with a sequence.
(<class 'scipy.sparse.csr.csr_matrix'>, <type 'numpy.float64'>)
You probably don't need the for loop in your code at all. The following code may do what you want:
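from sklearn.datasets import load_svmlight_file
from sklearn import svm

# pathToTrainData is the path defined in the question
X_train, Y_train = load_svmlight_file(pathToTrainData)
clf = svm.SVC(kernel='linear')
# Slice the first 5 rows directly; X_train stays a single sparse matrix
clf.fit(X_train[:5, :], Y_train[:5])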
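@tanemaki is right, but it's worth explaining why that fixes the problem. X_train is (most likely) a scipy.sparse matrix; the type printed in the question, scipy.sparse.csr.csr_matrix, supports this. Indexing it with an integer (X_train[i]) returns the entire i-th row as a 1 x n sparse matrix, so X ends up as a Python list of sparse row matrices rather than a single 2-D matrix. The fit method needs one matrix, and NumPy cannot convert that list into a float array, hence the ValueError. If you only want to train on the first 5 rows, slice as @tanemaki already demonstrated: X_train[:5, :] and Y_train[:5] (the labels are 1-D, so they take a single index).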
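To make the difference concrete, here is a minimal sketch that runs without the original data file. It assumes load_svmlight_file returned a CSR matrix and a 1-D label array, and stands in for them with small synthetic data. The names rng, rows, X_sliced, y_sliced and X_stacked are illustrative and not from the question. The sketch shows direct slicing, and also scipy.sparse.vstack as one possible way to rebuild a matrix if the rows really do have to be collected one at a time.

```python
import numpy as np
from scipy import sparse

# Synthetic stand-in for the output of load_svmlight_file (assumption:
# a CSR feature matrix plus a 1-D array of labels).
rng = np.random.RandomState(0)
X_train = sparse.csr_matrix(rng.rand(10, 4))
Y_train = np.array([0, 1] * 5, dtype=np.float64)

# Integer indexing returns a 1 x n sparse matrix, not a flat vector.
row = X_train[0]
print(type(row).__name__, row.shape)

# Collecting such rows yields a plain Python list of sparse matrices,
# which is the kind of object fit() could not convert in the question.
rows = [X_train[i] for i in range(5)]

# Option 1: slice the sparse matrix directly (no loop needed).
X_sliced = X_train[:5, :]
y_sliced = Y_train[:5]

# Option 2: if the rows were gathered one by one, stack them back
# into a single sparse matrix before fitting.
X_stacked = sparse.vstack(rows, format="csr")

print(X_sliced.shape, X_stacked.shape, y_sliced.shape)
```

Either X_sliced or X_stacked can then be passed to clf.fit together with y_sliced. Slicing is the simpler choice when the wanted rows are contiguous.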