Below is my code. When I use it in my application, it throws a strange error, shown further down. The error is caused by the "random_state" parameter of the SVC classifier (https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html).
from sklearn.svm import SVC
import pandas as pd
from sklearn.metrics import *
from sklearn.model_selection import train_test_split

def Fit_Model(TrainData, Algo):
    print(Algo)
    # Tab-separated input; the last column holds the class label.
    df = pd.read_csv(TrainData, sep='\t')
    clm_list = df.columns.tolist()
    X_train = df[clm_list[0:len(clm_list)-1]].values
    y_train = df[clm_list[len(clm_list)-1]].values
    X_train, X_test, y_train, y_test = train_test_split(X_train, y_train, test_size=0.2, random_state=0)
    prob = Algo.fit(X_train, y_train).predict_proba(X_test)
    predicted = Algo.fit(X_train, y_train).predict(X_test)

def SVM_classification(TrainFile, probability=None, randomRtate=None ):
    # probability and randomRtate are passed through exactly as received.
    pera = {"C":1.0,
            "kernel":'rbf',
            "degree":3,
            "gamma":'scale',
            "coef0":0.0,
            "shrinking":True,
            "probability":probability,
            "tol":0.001,
            "cache_size":200,
            "class_weight":None,
            "verbose":False,
            "max_iter":-1,
            "decision_function_shape":'ovr',
            "random_state":randomRtate,
            }
    model = SVC(**pera )
    Fit_Model(TrainData=TrainFile, Algo=model)

import argparse
parser = argparse.ArgumentParser()
parser.add_argument("-f", "--file_name",
                    required=True,
                    default=None,
                    help="Path to target CSV file")
parser.add_argument("-p", "--proba",
                    required=None,
                    default=False,
                    help="n_folds for Cross Validation")
parser.add_argument("-r", "--Rand",
                    required=None,
                    default=False,
                    help="n_folds for Cross Validation")
args = parser.parse_args()
SVM_classification( args.file_name, args.proba, args.Rand )
When I try to run the script:
$ python Stack.py -f Resampled.tsv -p True -r 9
Error:
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf',
max_iter=-1, probability='True', random_state='9', shrinking=True,
tol=0.001, verbose=False)
Traceback (most recent call last):
File "Stack.py", line 68, in <module>
SVM_classification( args.file_name, args.proba, args.Rand )
File "Stack.py", line 42, in SVM_classification
Fit_Model(TrainData=TrainFile, Algo=model)
File "Stack.py", line 15, in Fit_Model
prob = Algo.fit(X_train, y_train).predict_proba(X_test)
File "/home/joshij/anaconda3/envs/Jay/lib/python2.7/site-packages/sklearn/svm/base.py", line 140, in fit
rnd = check_random_state(self.random_state)
File "/home/joshij/anaconda3/envs/Jay/lib/python2.7/site-packages/sklearn/utils/validation.py", line 818, in check_random_state
' instance' % seed)
ValueError: '9' cannot be used to seed a numpy.random.RandomState instance
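The strange thing is that I am passing an integer, yet it shows the same error again and again. To test the code I passed "int(8)" directly, but the error was still the same.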
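The quotes in random_state='9' and probability='True' in the printed SVC repr suggest the values arrive as strings rather than as an integer and a boolean. A minimal check, assuming the same -p and -r options as the script (the build_parser helper name is only for illustration), shows what argparse hands back when no type= is given:

```python
import argparse


def build_parser():
    # Same options as the script, with no type= conversion (illustrative helper).
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--proba", default=False)
    parser.add_argument("-r", "--Rand", default=False)
    return parser


# Simulates: python Stack.py -p True -r 9
args = build_parser().parse_args(["-p", "True", "-r", "9"])

# Print the raw value and its type for each option.
print(repr(args.proba), type(args.proba).__name__)
print(repr(args.Rand), type(args.Rand).__name__)
```

Both values come back as str, which matches the quoted values in the repr above.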
Sample data "test.tsv":
col1 col2 col3 col4 class_label
3 4 5 3 0
2 3 3 4 0
2 3 3 5 0
2 3 3 4 0
2 3 2 4 0
2 3 3 3 1
1 2 3 2 1
1 5 6 9 1
1 2 2 2 1
1 2 2 2 1
Please help.
Update:
When I change
"random_state":randomRtate,
to
"random_state":int(randomRtate),
I get a different error:
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf',
max_iter=-1, probability='True', random_state=9, shrinking=True,
tol=0.001, verbose=False)
Traceback (most recent call last):
File "Stack.py", line 69, in <module>
SVM_classification( args.file_name, args.proba, args.Rand )
File "Stack.py", line 43, in SVM_classification
Fit_Model(TrainData=TrainFile, Algo=model)
File "Stack.py", line 15, in Fit_Model
prob = Algo.fit(X_train, y_train).predict_proba(X_test)
File "/home/joshij/anaconda3/envs/Jay/lib/python2.7/site-packages/sklearn/svm/base.py", line 212, in fit
fit(X, y, sample_weight, solver_type, kernel, random_seed=seed)
File "/home/joshij/anaconda3/envs/Jay/lib/python2.7/site-packages/sklearn/svm/base.py", line 271, in _dense_fit
max_iter=self.max_iter, random_seed=random_seed)
File "sklearn/svm/libsvm.pyx", line 64, in sklearn.svm.libsvm.fit
TypeError: an integer is required
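Please check all of your parameter types. You have set class_weight to None, which may override the default case of all 1s. You have also set probability to a string, where a boolean is expected. If the underlying implementation converts the bool to an int, that could produce the int type error.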
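One way to get the right types at the source is to let argparse do the conversion. The sketch below assumes the same option names as the question; str_to_bool is a hypothetical helper (it is not part of argparse or scikit-learn), the accepted spellings are an arbitrary choice, and the help texts are my own wording.

```python
import argparse


def str_to_bool(value):
    """Convert a command-line string such as 'True' or 'false' to a bool."""
    lowered = value.strip().lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


def build_parser():
    # Hypothetical rewrite of the question's parser with explicit types.
    parser = argparse.ArgumentParser()
    parser.add_argument("-f", "--file_name", required=True,
                        help="Path to target TSV file")
    parser.add_argument("-p", "--proba", type=str_to_bool, default=False,
                        help="Enable probability estimates")
    parser.add_argument("-r", "--Rand", type=int, default=None,
                        help="Integer seed for random_state")
    return parser


# Simulates: python Stack.py -f Resampled.tsv -p True -r 9
args = build_parser().parse_args(["-f", "Resampled.tsv", "-p", "True", "-r", "9"])
print(args.proba, args.Rand)
```

Note that type=bool would not help here, because bool("False") is True: any non-empty string is truthy.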
Several other parameters may also need to be integers. Notable examples include max_iter, degree, and cache_size. Try forcing all of these to integers as well.
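Forcing the types could look something like the following, applied to the parameter dictionary before it is passed to SVC(**pera). The coerce_params helper and the INT_PARAMS tuple are assumptions for illustration; which parameters really need to be integers may depend on the scikit-learn version.

```python
# Parameters the answer suggests forcing to int (an assumption, not an
# exhaustive or authoritative list).
INT_PARAMS = ("max_iter", "degree", "cache_size")


def coerce_params(params):
    """Return a copy of params with the listed entries cast to int."""
    fixed = dict(params)  # copy, so the caller's dict is left untouched
    for name in INT_PARAMS:
        if fixed.get(name) is not None:
            fixed[name] = int(fixed[name])
    # random_state may legitimately be None (unseeded), so only cast real values.
    if fixed.get("random_state") is not None:
        fixed["random_state"] = int(fixed["random_state"])
    return fixed


# Example input with the kinds of wrong types a command line can produce.
pera = {"max_iter": "-1", "degree": "3", "cache_size": 200.0, "random_state": "9"}
print(coerce_params(pera))
```

This only covers the integer-like parameters; a string such as 'True' for probability still has to be turned into a real boolean separately, since int('True') raises a ValueError.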