I want to run RFECV with CatBoost. Here is my sample code:
from sklearn.model_selection import KFold
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold
from sklearn.feature_selection import RFECV
from catboost import CatBoostClassifier

# X_train, y_train and prop (the class weights) are defined elsewhere.

def scoring_func(cl, X, Y_true):
    # Score with ROC AUC on the predicted probability of the positive class.
    Y_pred = cl.predict_proba(X)[:, 1]
    return roc_auc_score(Y_true, Y_pred)

kf = StratifiedKFold(n_splits=5, shuffle=True, random_state=8888)
cl = CatBoostClassifier(
    iterations=100,
    random_seed=63,
    learning_rate=0.05,
    custom_loss='F1',
    loss_function='Logloss',
    class_weights=prop,
    l2_leaf_reg=4
)
selector = RFECV(estimator=cl, cv=kf.split(X_train, y_train), scoring=scoring_func, verbose=1)
selector = selector.fit(X_train, y_train)
This raises the following error:
AttributeError: 'CatBoostClassifier' object has no attribute '_get_tags'
I can't seem to find any documentation that addresses this problem. Is there a solution?
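The workaround is to subclass the CatBoost estimator and give it a _get_tags method:

from catboost import CatBoostRegressor

# Shadow the original class with a subclass that defines _get_tags.
class CatBoostRegressor(CatBoostRegressor):
    def _get_tags(self):
        return {'allow_nan': True}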
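The question uses CatBoostClassifier rather than CatBoostRegressor, so the same subclassing trick can be applied to whichever estimator class is needed. The following is only a minimal sketch: with_get_tags is a hypothetical helper name (not part of CatBoost or scikit-learn), and the default tag dictionary simply mirrors the one in the workaround above.

```python
def with_get_tags(estimator_cls, tags=None):
    """Return a subclass of estimator_cls that defines a _get_tags method.

    with_get_tags is a hypothetical helper for illustration only.
    The default tags mirror the workaround: {'allow_nan': True}.
    """
    tags = dict(tags) if tags is not None else {"allow_nan": True}

    class Patched(estimator_cls):
        def _get_tags(self):
            # Return a copy so callers cannot mutate the shared dict.
            return dict(tags)

    return Patched
```

Assuming catboost is installed, it could be used as PatchedClassifier = with_get_tags(CatBoostClassifier), after which PatchedClassifier(iterations=100, ...) is passed to RFECV in place of the original estimator. Whether this is still needed depends on the installed CatBoost and scikit-learn versions.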
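Note, however, that a _get_tags method has since been merged into CatBoost itself.
What still remains a problem is the same setup with categorical variables (for other reasons).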