>I am doing multiclass classification. The data has shape (299, 6) and the labels have shape (299, 5). Here is a sample of my data:
[[0.004873972,0.069813839,-0.470500136,2.285885634,0.5335,0.052915143],
[0.001698812,0.041216647,-0.01333925,2.507806584,0.2332,0.123463255],
[0.005954432,0.077164967,4.749752766,26.45721079,0.1663,0.186452725],
[0.001792197,0.042334345,-0.176201652,1.9656153,0.4001,0.087055596],
[0.001966929,0.044350068,0.182059972,1.610369693,0.55,0.29675874]]
The labels for this data, stored in a CSV file, look like this:
[[1,0,0,0,0],[0,0,0,1,0],[0,0,0,1,0],[0,0,1,0,0],[0,1,0,0,0]]
I tried SVM and logistic regression, but both give me the error ValueError: bad input shape (299, 5). The error comes from the labels, but how do I fix it?
[sample dataset][1]
[1]: https://i.stack.imgur.com/Wncqy.png
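You can run this as a standard classification task: convert the one-hot encoding to class labels and train an SVM classifier. See the sample code: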
import numpy as np
from sklearn.svm import SVC
data = np.array([[0.004873972,0.069813839,-0.470500136,2.285885634,0.5335,0.052915143],
[0.001698812,0.041216647,-0.01333925,2.507806584,0.2332,0.123463255],
[0.005954432,0.077164967,4.749752766,26.45721079,0.1663,0.186452725],
[0.001792197,0.042334345,-0.176201652,1.9656153,0.4001,0.087055596],
[0.001966929,0.044350068,0.182059972,1.610369693,0.55,0.29675874]])
outputs = np.array([[1,0,0,0,0],[0,0,0,1,0],[0,0,0,1,0],[0,0,1,0,0],[0,1,0,0,0]])
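# Each row of `outputs` is one-hot, so take the argmax along axis=1
# to get one class index per sample: [0, 3, 3, 2, 1]
labels = np.argmax(outputs, axis=1)
clf = SVC()
clf.fit(data, labels)
# Accuracy on the training rows themselves (the original post reported 0.6)
print(clf.score(data, labels))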
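Since the question also mentions logistic regression, here is a minimal sketch of the same conversion at the full (299, 6) / (299, 5) shapes. The data is randomly generated as a stand-in for the real CSV contents, and the names `X`, `class_ids`, and `one_hot` are made up for illustration. It shows that the classifier wants a 1-D label array of shape (299,), which is what `np.argmax(..., axis=1)` produces.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the real data: 299 samples, 6 features, 5 classes.
rng = np.random.default_rng(0)
n_samples, n_features, n_classes = 299, 6, 5
X = rng.normal(size=(n_samples, n_features))
class_ids = rng.integers(0, n_classes, size=n_samples)

# Build one-hot labels like the ones in the question: shape (299, 5).
one_hot = np.eye(n_classes, dtype=int)[class_ids]

# Collapse each one-hot row to its class index: shape (299,).
labels = np.argmax(one_hot, axis=1)
print(one_hot.shape, labels.shape)

# A 1-D label array is accepted by LogisticRegression (and by SVC).
clf = LogisticRegression(max_iter=1000)
clf.fit(X, labels)
predictions = clf.predict(X)
```

Passing `one_hot` directly to `fit` is what triggers the shape error, because these estimators expect one label per sample rather than one column per class.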
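For parameter tuning, see "Hyperparameter Tuning the Random Forest in Python" and "Comparing randomized search and grid search for hyperparameter estimation".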
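As a rough illustration of what such tuning could look like for the SVM above, the sketch below runs both a grid search and a randomized search with scikit-learn. The dataset comes from `make_classification` as a placeholder for the real data, and the candidate values for `C` and `gamma` are arbitrary assumptions, not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV
from sklearn.svm import SVC

# Placeholder data with the same shape as in the question: (299, 6), 5 classes.
X, y = make_classification(
    n_samples=299, n_features=6, n_informative=4, n_redundant=0,
    n_classes=5, n_clusters_per_class=1, random_state=0,
)

# Candidate hyperparameters (illustrative values only).
param_grid = {"C": [0.1, 1, 10], "gamma": ["scale", 0.1]}

# Grid search: evaluates every combination with 3-fold cross-validation.
grid = GridSearchCV(SVC(), param_grid=param_grid, cv=3)
grid.fit(X, y)
print("grid search:", grid.best_params_, grid.best_score_)

# Randomized search: samples 4 of the 6 combinations instead of trying all.
rand = RandomizedSearchCV(
    SVC(), param_distributions=param_grid, n_iter=4, cv=3, random_state=0,
)
rand.fit(X, y)
print("randomized search:", rand.best_params_, rand.best_score_)
```

Grid search is exhaustive, so its cost grows with every added value; randomized search trades completeness for a fixed budget of `n_iter` trials.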