I have some code that tries to use a non-linear SVM (RBF kernel).
import numpy as np
from sklearn import svm
raw_data1 = open("/Users/prateek/Desktop/Programs/ML/Dataset.csv")
raw_data2 = open("/Users/prateek/Desktop/Programs/ML/Result.csv")
dataset1 = np.loadtxt(raw_data1,delimiter=",")
result1 = np.loadtxt(raw_data2,delimiter=",")
clf = svm.NuSVC(kernel='rbf')
clf.fit(dataset1,result1)
However, when I try to fit, I get this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/prateek/Desktop/Programs/ML/lib/python2.7/site-packages/sklearn/svm/base.py", line 193, in fit
fit(X, y, sample_weight, solver_type, kernel, random_seed=seed)
File "/Users/prateek/Desktop/Programs/ML/lib/python2.7/site-packages/sklearn/svm/base.py", line 251, in _dense_fit
max_iter=self.max_iter, random_seed=random_seed)
File "sklearn/svm/libsvm.pyx", line 187, in sklearn.svm.libsvm.fit (sklearn/svm/libsvm.c:2098)
ValueError: specified nu is infeasible
Link to Result.csv
Link to Dataset
What is the reason for this error?
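> As the documentation points out, the nu parameter is "an upper bound on the fraction of training errors and a lower bound of the fraction of support vectors".
So whenever you try to fit your data and this bound cannot be satisfied, the optimization problem becomes infeasible. Hence your error.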
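To make the feasibility bound more concrete, here is a minimal sketch of the check that, as far as I know, libsvm performs before training a nu-SVC: for every pair of classes with n1 and n2 samples, it rejects nu when nu * (n1 + n2) / 2 > min(n1, n2). The helper name max_feasible_nu and the label counts below are hypothetical and only illustrate the rule; they are not taken from your data.

```python
from collections import Counter
from itertools import combinations


def max_feasible_nu(labels):
    """Largest nu that satisfies nu * (n1 + n2) / 2 <= min(n1, n2) for every class pair.

    Hypothetical helper: it mirrors the pairwise class-count check described above,
    it is not part of scikit-learn.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes")
    return min(
        2.0 * min(n1, n2) / (n1 + n2)
        for n1, n2 in combinations(counts.values(), 2)
    )


# Made-up, heavily imbalanced labels: 96 samples of class 0 and 4 of class 1
labels = [0] * 96 + [1] * 4
print(max_feasible_nu(labels))  # 0.08
```

Under this rule, a strongly imbalanced label file would explain why the default nu=0.5 is rejected while a very small value is accepted.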
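In fact, I looped from 1. down to 0.1 (decreasing in steps of 0.1) and still got the error, then simply tried 0.01 and there were no complaints. But of course, you should check the results of fitting your model with that value and check whether the accuracy of the predictions is acceptable.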
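The kind of search described above could look roughly like the sketch below. It is not the exact loop I ran: the function name first_feasible_nu, the candidate list, and the synthetic imbalanced data are all assumptions made for illustration, so substitute your own arrays.

```python
import numpy as np
from sklearn.svm import NuSVC


def first_feasible_nu(X, y, candidates):
    """Try each nu in order; return (nu, fitted model, rejected values).

    Hypothetical helper: only the "infeasible" ValueError is swallowed,
    any other error is re-raised so real data problems are not hidden.
    """
    rejected = []
    for nu in candidates:
        try:
            clf = NuSVC(kernel="rbf", nu=nu).fit(X, y)
        except ValueError as exc:
            if "infeasible" not in str(exc):
                raise
            rejected.append(nu)
            continue
        return nu, clf, rejected
    return None, None, rejected


# Synthetic stand-in data (NOT the dataset from the question): 96 vs 4 samples
rng = np.random.RandomState(0)
X = rng.randn(100, 2)
y = np.array([0] * 96 + [1] * 4)

nu, clf, rejected = first_feasible_nu(X, y, [0.5, 0.2, 0.1, 0.05, 0.01])
print("first feasible nu:", nu)
print("rejected values:", rejected)
```

Finding a value that fits only shows the problem is feasible; it says nothing yet about how well the resulting model predicts.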
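Update: I was actually curious, so I split your dataset for validation, and the resulting accuracy was 69% (I also think your training set may be rather small).
Just for reproducibility, here is the quick test I ran: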
from sklearn import svm
import numpy as np
# sklearn.cross_validation was removed in scikit-learn 0.20; train_test_split now lives in model_selection
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Load the features and labels from the CSV files linked in the question
dataset1 = np.loadtxt("Dataset.csv", delimiter=",")
result1 = np.loadtxt("Result.csv", delimiter=",")
# nu=0.01 was the first value I tried that did not raise "specified nu is infeasible"
clf = svm.NuSVC(kernel='rbf', nu=0.01)
# Hold out 25% of the samples for validation
X_train, X_test, y_train, y_test = train_test_split(dataset1, result1, test_size=0.25, random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print(accuracy_score(y_test, y_pred))
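Because a single 75/25 split on a small dataset can give a noisy accuracy figure, a stratified k-fold estimate may be more informative. The sketch below is only an illustration on synthetic data from make_classification, with nu=0.1 and 5 folds chosen arbitrarily; none of these values come from your files.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.svm import NuSVC

# Synthetic, roughly balanced stand-in data (NOT the dataset from the question)
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Stratified folds keep the class proportions similar in every split,
# which also keeps nu feasible in each training fold
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(NuSVC(kernel="rbf", nu=0.1), X, y, cv=cv, scoring="accuracy")

print("per-fold accuracy:", scores)
print("mean accuracy: %.3f, std: %.3f" % (scores.mean(), scores.std()))
```

The spread across folds gives a rough idea of how much the 69% figure from one split could vary.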