Error: __init__() got an unexpected keyword argument 'n_splits'



I want to use ShuffleSplit() on the California housing dataset (source: https://www.dcc.fc.up.pt/~ltorgo/Regression/cal_housing.html) to fit an SGD regression.
However, calling it raises an 'n_splits' error.
The code is as follows:

from sklearn import cross_validation, grid_search, linear_model, metrics  
import numpy as np  
import pandas as pd

from sklearn.preprocessing import scale
from sklearn.cross_validation import ShuffleSplit

housing_data = pd.read_csv('cal_housing.csv', header = 0, sep = ',')
housing_data.fillna(housing_data.mean(), inplace=True)
df=pd.get_dummies(housing_data)

y_target = housing_data['median_house_value'].values
x_features = housing_data.drop(['median_house_value'], axis = 1)
from sklearn.cross_validation import train_test_split
from sklearn import model_selection
train_x, test_x, train_y, test_y = model_selection.train_test_split(x_features, y_target, test_size=0.2, random_state=4)
reg = linear_model.SGDRegressor(random_state=0)
cv = ShuffleSplit(n_splits = 10, test_size = 0.2, random_state = 0)

The error is as follows:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-22-8f8760b04f8c> in <module>()
----> 1 cv = ShuffleSplit(n_splits = 10, test_size = 0.2, random_state = 0)
TypeError: __init__() got an unexpected keyword argument 'n_splits'

I have updated scikit-learn to version 0.18.

Anaconda version: 4.5.8

Could you advise on this problem?

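You are mixing two different modules.

Before 0.18, ShuffleSplit lived in cross_validation. That version has no n_splits parameter: its first argument, n, is the total number of samples, and n_iter sets the number of splits.

Since you have now updated to 0.18, cross_validation and grid_search are deprecated in favor of model_selection.

The documentation mentions this and states that these modules will be removed in version 0.20.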
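If it is unclear which ShuffleSplit an import actually resolves to, one way to check is to inspect the class itself. This is a minimal sketch, assuming scikit-learn 0.18 or later is installed; the variable names are illustrative only.

```python
import inspect

import sklearn
from sklearn.model_selection import ShuffleSplit

# The module path shows whether the class comes from model_selection
# or from the deprecated cross_validation module.
origin = ShuffleSplit.__module__

# The constructor parameters show whether n_splits is accepted.
params = inspect.signature(ShuffleSplit.__init__).parameters
has_n_splits = "n_splits" in params

print("scikit-learn version:", sklearn.__version__)
print("ShuffleSplit defined in:", origin)
print("accepts n_splits:", has_n_splits)
```

If the module path mentions cross_validation, the old class is still being imported and n_splits will be rejected.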

So, instead of this:

from sklearn.cross_validation import ShuffleSplit
from sklearn.cross_validation import train_test_split

Do this:

from sklearn.model_selection import ShuffleSplit
from sklearn.model_selection import train_test_split

Then you can use n_splits:

cv = ShuffleSplit(n_splits = 10, test_size = 0.2, random_state = 0)
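As a rough illustration of how the resulting cv object can then be used, the sketch below passes it to cross_val_score with an SGDRegressor. It makes several assumptions that are not in the question: synthetic data from make_regression stands in for cal_housing.csv, the features are standardized in a pipeline, and mean absolute error is an arbitrary choice of metric.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import SGDRegressor
from sklearn.model_selection import ShuffleSplit, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the housing data (assumption: the real CSV is not used here).
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Same splitter as in the answer: 10 random splits, 20% held out each time.
cv = ShuffleSplit(n_splits=10, test_size=0.2, random_state=0)

# Each split yields a pair of index arrays (train, test).
split_sizes = [(len(train_idx), len(test_idx)) for train_idx, test_idx in cv.split(X)]

# SGD is sensitive to feature scale, so standardize inside the pipeline.
model = make_pipeline(StandardScaler(), SGDRegressor(random_state=0))

# One score per split; scikit-learn negates error metrics so that higher is better.
scores = cross_val_score(model, X, y, cv=cv, scoring="neg_mean_absolute_error")
print(split_sizes[0], scores.mean())
```

Scaling inside the pipeline means the scaler is refit on each training split, so no information from the held-out rows leaks into the fit.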
