Strange error in a support vector machine: "ValueError: 'x' cannot be used to seed a numpy.random.RandomState instance"?



When I use the code below in my application, it throws the strange error shown further down. The error is caused by the "random_state" parameter of the SVC classifier (https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html).

from sklearn.svm import SVC
import pandas as pd
from sklearn.metrics import *
from sklearn.model_selection import train_test_split

def Fit_Model(TrainData, Algo):
    print(Algo)

    # The input file is tab-separated; the last column holds the class label.
    df = pd.read_csv(TrainData, sep='\t')
    clm_list = df.columns.tolist()
    X_train = df[clm_list[0:len(clm_list)-1]].values
    y_train = df[clm_list[len(clm_list)-1]].values
    X_train, X_test, y_train, y_test = train_test_split(X_train, y_train, test_size=0.2, random_state=0)
    prob = Algo.fit(X_train, y_train).predict_proba(X_test)
    predicted = Algo.fit(X_train, y_train).predict(X_test)

def SVM_classification(TrainFile, probability=None, randomRtate=None):

    pera = {"C": 1.0,
            "kernel": 'rbf',
            "degree": 3,
            "gamma": 'scale',
            "coef0": 0.0,
            "shrinking": True,
            "probability": probability,
            "tol": 0.001,
            "cache_size": 200,
            "class_weight": None,
            "verbose": False,
            "max_iter": -1,
            "decision_function_shape": 'ovr',
            "random_state": randomRtate,
            }
    model = SVC(**pera)

    Fit_Model(TrainData=TrainFile, Algo=model)

import argparse
parser = argparse.ArgumentParser()
parser.add_argument("-f", "--file_name",
                    required=True,
                    default=None,
                    help="Path to target CSV file")
parser.add_argument("-p", "--proba",
                    required=None,
                    default=False,
                    help="n_folds for Cross Validation")
parser.add_argument("-r", "--Rand",
                    required=None,
                    default=False,
                    help="n_folds for Cross Validation")

args = parser.parse_args()
SVM_classification(args.file_name, args.proba, args.Rand)

When I try to run the script:

$ python Stack.py -f Resampled.tsv -p True -r 9

Error:

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf',
max_iter=-1, probability='True', random_state='9', shrinking=True,
tol=0.001, verbose=False)
Traceback (most recent call last):
File "Stack.py", line 68, in <module>
SVM_classification( args.file_name, args.proba, args.Rand )
File "Stack.py", line 42, in SVM_classification
Fit_Model(TrainData=TrainFile,  Algo=model)
File "Stack.py", line 15, in Fit_Model
prob = Algo.fit(X_train, y_train).predict_proba(X_test)
File "/home/joshij/anaconda3/envs/Jay/lib/python2.7/site-packages/sklearn/svm/base.py", line 140, in fit
rnd = check_random_state(self.random_state)
File "/home/joshij/anaconda3/envs/Jay/lib/python2.7/site-packages/sklearn/utils/validation.py", line 818, in check_random_state
' instance' % seed)
ValueError: '9' cannot be used to seed a numpy.random.RandomState instance

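Strangely, though, I am passing an integer, and it keeps showing the same error. To test this, I passed "int(8)" directly in the code, but the error is still the same.

Sample data "test.tsv":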

col1    col2    col3    col4    class_label
3   4   5   3   0
2   3   3   4   0
2   3   3   5   0
2   3   3   4   0
2   3   2   4   0
2   3   3   3   1
1   2   3   2   1
1   5   6   9   1
1   2   2   2   1
1   2   2   2   1

Please help.

Update:

When I change

"random_state":randomRtate,

to

"random_state":int(randomRtate),

I now get a different error:

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape='ovr', degree=3, gamma='scale', kernel='rbf',
max_iter=-1, probability='True', random_state=9, shrinking=True,
tol=0.001, verbose=False)
Traceback (most recent call last):
File "Stack.py", line 69, in <module>
SVM_classification( args.file_name, args.proba, args.Rand )
File "Stack.py", line 43, in SVM_classification
Fit_Model(TrainData=TrainFile,  Algo=model)
File "Stack.py", line 15, in Fit_Model
prob = Algo.fit(X_train, y_train).predict_proba(X_test)
File "/home/joshij/anaconda3/envs/Jay/lib/python2.7/site-packages/sklearn/svm/base.py", line 212, in fit
fit(X, y, sample_weight, solver_type, kernel, random_seed=seed)
File "/home/joshij/anaconda3/envs/Jay/lib/python2.7/site-packages/sklearn/svm/base.py", line 271, in _dense_fit
max_iter=self.max_iter, random_seed=random_seed)
File "sklearn/svm/libsvm.pyx", line 64, in sklearn.svm.libsvm.fit
TypeError: an integer is required 

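Please check all of your parameter types. You have set class_weight to None; that is already SVC's documented default (all classes weighted 1), so it should not be the problem. You have also set probability to a string where a boolean is required. If the underlying implementation converts the bool to an int, that could produce the integer type error.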
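The quoted values in the printed model (probability='True', random_state='9') suggest the command-line values arrive as strings, which is what argparse stores when no type= is given. Below is a minimal sketch of converting them at the parser, assuming the same -f/-p/-r flags as in the question. The helper str_to_bool and the function build_parser are hypothetical names for illustration, not part of argparse or scikit-learn.

```python
import argparse


def str_to_bool(text):
    """Hypothetical helper: map common true/false spellings to a real bool."""
    lowered = text.strip().lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % text)


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("-f", "--file_name", required=True,
                        help="Path to the tab-separated training file")
    # type= makes argparse convert the raw string before storing it.
    parser.add_argument("-p", "--proba", type=str_to_bool, default=False,
                        help="Enable probability estimates (True/False)")
    parser.add_argument("-r", "--Rand", type=int, default=None,
                        help="Integer seed passed on as random_state")
    return parser


# Simulate the command line from the question instead of reading sys.argv.
args = build_parser().parse_args(["-f", "Resampled.tsv", "-p", "True", "-r", "9"])
print(type(args.proba).__name__, type(args.Rand).__name__)
```

With these converters, "-p True -r 9" is stored as a bool and an int rather than as the strings 'True' and '9'. The default for --Rand is None in this sketch to match SVC's own random_state default, which differs from the default=False used in the question.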

Several other parameters may also need to be integers. Notable examples include max_iter, degree, and cache_size. Try forcing all of those to be integers as well.
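One way to force the numeric parameters is to cast them just before building the estimator. The sketch below is an assumption about how that could look, not the answerer's code: coerce_params and the CASTERS table are hypothetical names, and the int/float choices follow the types listed in the scikit-learn SVC documentation (where cache_size is documented as a float). It also assumes that any seed supplied is an int or a numeric string.

```python
# Hypothetical caster table: parameter name -> type to force it to.
CASTERS = {
    "degree": int,
    "max_iter": int,
    "C": float,
    "coef0": float,
    "tol": float,
    "cache_size": float,
}


def coerce_params(params):
    """Return a copy of params with the listed entries cast to the expected types."""
    fixed = dict(params)
    for name, caster in CASTERS.items():
        if name in fixed and fixed[name] is not None:
            fixed[name] = caster(fixed[name])
    # random_state may legitimately be None, so only cast a real value.
    # Assumption: the seed is an int or a numeric string such as '9'.
    if fixed.get("random_state") is not None:
        fixed["random_state"] = int(fixed["random_state"])
    return fixed


raw = {"C": "1.0", "degree": "3", "max_iter": "-1",
       "cache_size": 200, "random_state": "9"}
clean = coerce_params(raw)
print(clean)
# The cleaned dict can then be unpacked as in the question: SVC(**clean)
```

Casting in one place keeps the parameter dictionary from the question unchanged while making sure nothing string-typed reaches the estimator.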
