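I am unable to use KNeighborsClassifier with GridSearchCV. When I call grid.fit(dataImp,y), I get the following error:
TypeError: "__init__() got an unexpected keyword argument 'p'"
The error can be reproduced with any data. The data referenced here is just dummy data for testing.
Code to reproduce: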
from sklearn.grid_search import GridSearchCV
from sklearn import cross_validation
from sklearn import neighbors
import numpy as np
dataImpNew = np.transpose(np.atleast_2d(np.arange(20.)))*np.arange(20.)
yNew = np.sign(np.arange(-5.5,14))
nFolds = 4
random_state = 1234
metrics = ['minkowski','euclidean','manhattan']
weights = ['uniform','distance'] #10.0**np.arange(-5,4)
numNeighbors = np.arange(5,10)
param_grid = dict(metric=metrics,weights=weights,n_neighbors=numNeighbors)
cv = cross_validation.StratifiedKFold(yNew,nFolds)
grid = GridSearchCV(neighbors.KNeighborsClassifier(),param_grid=param_grid,cv=cv)
grid.fit(dataImpNew,yNew)
Full traceback:
Traceback (most recent call last):
File "/home/pjvalla/testDir/test.py", line 25, in <module>
grid.fit(dataImpNew,yNew)
File "/usr/lib/python2.7/dist-packages/sklearn/grid_search.py", line 596, in fit
return self._fit(X, y, ParameterGrid(self.param_grid))
File "/usr/lib/python2.7/dist-packages/sklearn/grid_search.py", line 378, in _fit
for parameters in parameter_iterable
File "/usr/lib/python2.7/dist-packages/joblib/parallel.py", line 653, in __call__
self.dispatch(function, args, kwargs)
File "/usr/lib/python2.7/dist-packages/joblib/parallel.py", line 400, in dispatch
job = ImmediateApply(func, args, kwargs)
File "/usr/lib/python2.7/dist-packages/joblib/parallel.py", line 138, in __init__
self.results = func(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/sklearn/cross_validation.py", line 1239, in _fit_and_score
estimator.fit(X_train, y_train, **fit_params)
File "/usr/lib/python2.7/dist-packages/sklearn/neighbors/base.py", line 628, in fit
return self._fit(X)
File "/usr/lib/python2.7/dist-packages/sklearn/neighbors/base.py", line 217, in _fit
**self.effective_metric_kwds_)
File "binary_tree.pxi", line 1062, in sklearn.neighbors.kd_tree.BinaryTree.__init__ (sklearn/neighbors/kd_tree.c:8380)
File "dist_metrics.pyx", line 280, in sklearn.neighbors.dist_metrics.DistanceMetric.get_metric (sklearn/neighbors/dist_metrics.c:4066)
TypeError: __init__() got an unexpected keyword argument 'p'
Works for me, although I had to rename dataImpNew and yNew (dropping the "New" part):
In [4]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
:from sklearn.grid_search import GridSearchCV
:from sklearn import cross_validation
:from sklearn import neighbors
:import numpy as np
:
:dataImp = np.transpose(np.atleast_2d(np.arange(20.)))*np.arange(20.)
:y = np.sign(np.arange(-5.5,14))
:nFolds = 4
:random_state = 1234
:metrics = ['minkowski','euclidean','manhattan']
:weights = ['uniform','distance'] #10.0**np.arange(-5,4)
:numNeighbors = np.arange(5,10)
:param_grid = dict(metric=metrics,weights=weights,n_neighbors=numNeighbors)
:cv = cross_validation.StratifiedKFold(y,nFolds)
:grid = GridSearchCV(neighbors.KNeighborsClassifier(),param_grid=param_grid,cv=cv)
:grid.fit(dataImp,y)
:
:<EOF>
Out[4]:
GridSearchCV(cv=sklearn.cross_validation.StratifiedKFold(labels=[-1. -1. -1. -1. -1. -1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.
1. 1.], n_folds=4, shuffle=False, random_state=None),
estimator=KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_neighbors=5, p=2, weights='uniform'),
fit_params={}, iid=True, loss_func=None, n_jobs=1,
param_grid={'n_neighbors': array([5, 6, 7, 8, 9]), 'metric': ['minkowski', 'euclidean', 'manhattan'], 'weights': ['uniform', 'distance']},
pre_dispatch='2*n_jobs', refit=True, score_func=None, scoring=None,
verbose=0)
Can you post the full stack trace?
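
For anyone hitting the same error, here is a possible workaround sketch. The traceback suggests that `p` ends up being passed to the distance metric when a non-Minkowski metric name is selected; that reading is an inference from the traceback, not something confirmed in this thread. Since Minkowski distance with `p=1` is Manhattan distance and with `p=2` is Euclidean distance, the same search can be expressed by keeping `metric='minkowski'` and searching over `p`. The sketch assumes a recent scikit-learn, where `GridSearchCV` and `StratifiedKFold` live in `sklearn.model_selection` rather than the `sklearn.grid_search` and `sklearn.cross_validation` modules used in the report; the variable names `X` and `y` are mine.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

# Same dummy data as in the report: a 20x20 matrix and 20 labels in {-1, 1}.
X = np.transpose(np.atleast_2d(np.arange(20.0))) * np.arange(20.0)
y = np.sign(np.arange(-5.5, 14))

# Search over p instead of metric names (assumed workaround):
# p=1 corresponds to Manhattan distance, p=2 to Euclidean distance.
param_grid = {
    "p": [1, 2],
    "weights": ["uniform", "distance"],
    "n_neighbors": list(range(5, 10)),
}

cv = StratifiedKFold(n_splits=4)
grid = GridSearchCV(
    KNeighborsClassifier(metric="minkowski"),
    param_grid=param_grid,
    cv=cv,
)
grid.fit(X, y)
print(grid.best_params_)
```

This covers only two of the three metrics from the original grid explicitly, because 'minkowski' with the default `p=2` and 'euclidean' are the same distance.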