假设我正在对我的一个模型进行超参数调优,假设我使用AdaBoostClassifier()并希望传递不同的base_estimator,因此我传递SVC &DecisionTreeClassifier作为估计量
_parameters=[
{
'mdl':[AdaBoostClassifier(random_state=23)],
'mdl__learning_rate':np.linspace(0,1,20),
'mdl__base_estimator':[SVC(),DecisionTreeClassifier()]
}
]
现在,我想把不同的值传递给decisiontreecclassifier的ccp_alpha,就像这样
'mdl__base_estimator':[LinearRegression(),DecisionTreeClassifier(ccp_alpha=[0.1,0.2,0.3,0.4])]
我怎么能做到这一点,我试着像这样传递它,但它不工作,这是我的整个代码
pipeline=Pipeline(
[
('scal',StandardScaler()),
('mdl','passthrough')
]
)
_parameters=[
{
'mdl':[DecisionTreeClassifier(random_state=42)] ,
'mdl__max_depth':np.linspace(2,30,2),
'mdl__min_samples_split':np.linspace(1,10,1),
'mdl__max_features':np.linspace(1,100,1),
'mdl__ccp_alpha':np.linspace(0,1,10)
}
,{
'mdl':[AdaBoostClassifier(random_state=23)],
'mdl__learning_rate':np.linspace(0,1,20),
'mdl__base_estimator':[SVC(),DecisionTreeClassifier(ccp_alpha=[0.3,0.4,0.5,0.7])]
}
]
grid_search=GridSearchCV(_pipeline,_parameters,cv=3,n_jobs=-1,scoring='f1')
grid_search.fit(x,y
)
这种分割就是为什么param_grid
可以是一个字典列表,就像你的外部分割一样;但是它不能很容易地处理嵌套析取。我想到了两种方法。
更多不相交的网格:
_parameters=[
{
'mdl': [DecisionTreeClassifier(random_state=42)],
'mdl__max_depth': np.linspace(2,30,2),
'mdl__min_samples_split': np.linspace(1,10,1),
'mdl__max_features': np.linspace(1,100,1),
'mdl__ccp_alpha': np.linspace(0,1,10),
},
{
'mdl': [AdaBoostClassifier(random_state=23)],
'mdl__learning_rate': np.linspace(0,1,20),
'mdl__base_estimator': [SVC()],
},
{
'mdl': [AdaBoostClassifier(random_state=23)],
'mdl__learning_rate': np.linspace(0,1,20),
'mdl__base_estimator': [DecisionTreeClassifier()],
'mdl__base_estimator__ccp_alpha': [0.3,0.4,0.5,0.7],
},
]
或列表推导式:
_parameters=[
{
'mdl': [DecisionTreeClassifier(random_state=42)],
'mdl__max_depth': np.linspace(2,30,2),
'mdl__min_samples_split': np.linspace(1,10,1),
'mdl__max_features': np.linspace(1,100,1),
'mdl__ccp_alpha': np.linspace(0,1,10),
},
{
'mdl': [AdaBoostClassifier(random_state=23)],
'mdl__learning_rate': np.linspace(0,1,20),
'mdl__base_estimator': [SVC()] + [DecisionTreeClassifier(ccp_alpha=a) for a in [0.3,0.4,0.5,0.7]],
},
]