How do I use KFold instead of StratifiedKFold for RFECV in scikit-learn?

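# Old API: sklearn.cross_validation (deprecated in 0.18, removed in 0.20)
from sklearn.cross_validation import StratifiedKFold, KFold
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression

rfecv = RFECV(estimator=LogisticRegression(), step=1, cv=StratifiedKFold(y, 10),
              scoring='accuracy')
rfecv.fit(X, y)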

is an example of running RFECV with StratifiedKFold. The question is: how do I run RFECV with a plain KFold?

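cv=KFold(y, 10) is not the answer, because KFold and StratifiedKFold take and return completely different values: in this API, KFold expects the number of samples as its first argument, whereas StratifiedKFold expects the label array.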

KFold(len(y), n_folds = n_folds)

is the answer. So for 10 folds, it would look like:

rfecv = RFECV(estimator=LogisticRegression(), step=1, cv=KFold(len(y), n_folds=10),
              scoring='accuracy')
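
For anyone reading this on a newer release: sklearn.cross_validation was deprecated in 0.18 and removed in 0.20, and KFold now lives in sklearn.model_selection, where it takes n_splits instead of the sample count. A minimal sketch of the same idea under that API follows. The synthetic data from make_classification, the shuffle/random_state settings, and max_iter=1000 are stand-ins chosen for illustration and are not part of the original question.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

# Stand-in data (assumption): replace with your own X and y.
X, y = make_classification(n_samples=100, n_features=8, n_informative=3,
                           random_state=0)

# Plain (non-stratified) 10-fold splitter; no need to pass len(y) any more.
cv = KFold(n_splits=10, shuffle=True, random_state=0)

rfecv = RFECV(estimator=LogisticRegression(max_iter=1000), step=1, cv=cv,
              scoring='accuracy')
rfecv.fit(X, y)

print("Number of selected features:", rfecv.n_features_)
print("Selected feature mask:", rfecv.support_)
```

Passing the KFold instance explicitly matters here, because an integer cv with a classifier makes scikit-learn fall back to stratified folds.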

You can build your own CV strategy by hand, mimicking whatever KFold does:

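def createCV():
    '''Returns something like:
    custom_cv = [([0, 1, 2, 3, 4, 5, 6], [7]),
                 ([0, 1, 2, 3, 4, 5], [6]),
                 ([0, 1, 2, 3, 4], [5]),
                 ([0, 1, 2, 3], [4]),
                 ([0, 1, 2], [3])]
    where the first element of each tuple is the training set and the second is the test set.
    '''
    # Placeholder body: return the example splits from the docstring.
    # Replace with your own logic for building (train, test) index pairs.
    return [([0, 1, 2, 3, 4, 5, 6], [7]),
            ([0, 1, 2, 3, 4, 5], [6]),
            ([0, 1, 2, 3, 4], [5]),
            ([0, 1, 2, 3], [4]),
            ([0, 1, 2], [3])]

manual_cv = createCV()
rfecv = RFECV(estimator=LogisticRegression(), step=1, cv=manual_cv,
              scoring='accuracy')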

You can even take the splits that KFold gives you and rearrange them inside createCV to suit your CV needs.
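
To make the hand-rolled approach concrete, here is one way such a function could be written if you wanted contiguous, unshuffled folds in the style of KFold. This is only a sketch: make_kfold_indices is a hypothetical helper name, not a scikit-learn function, and it assumes the first n_samples % n_folds folds should each receive one extra sample.

```python
def make_kfold_indices(n_samples, n_folds):
    """Return a list of (train_indices, test_indices) tuples.

    Hypothetical helper: cuts range(n_samples) into n_folds contiguous
    test blocks; the first n_samples % n_folds blocks get one extra sample.
    """
    if not 2 <= n_folds <= n_samples:
        raise ValueError("need 2 <= n_folds <= n_samples")
    base, extra = divmod(n_samples, n_folds)
    splits = []
    start = 0
    for fold in range(n_folds):
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        # Everything outside the current test block is training data.
        train = list(range(0, start)) + list(range(stop, n_samples))
        splits.append((train, test))
        start = stop
    return splits


custom_cv = make_kfold_indices(8, 4)
for train, test in custom_cv:
    print("train:", train, "test:", test)
```

Because RFECV accepts any iterable of (train, test) index pairs for cv, a list like custom_cv can be passed as cv=custom_cv, and you are free to reorder, drop, or merge the pairs before doing so.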
