Understanding sklearn grid_search

我很难理解grid_search类是如何工作的。我想找到能和RandomForestClassifier一起使用的最好的max_depth参数。我指定了我希望搜索通过的可能选项，我希望模块输出"最适合"的max_depth选项。

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn import grid_search
iris= load_iris()    
forest_parameters = {'max_depth': [1,2,3,4]}
forest = RandomForestClassifier()
explorer = grid_search.GridSearchCV(forest, forest_parameters)
explorer.fit(iris['data'], iris['target'])

我期望我的explorer网格搜索模块返回最佳的max_depth参数，给定一组可能的选项[1,2,3,4]。为什么None的默认值仍在使用？如何使用grid_search来查找"最佳拟合"参数？

Out[13]: 
GridSearchCV(cv=None, error_score='raise',
       estimator=RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
       ---> max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=1,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False),
       fit_params={}, iid=True, loss_func=None, n_jobs=1,
       param_grid={'max_depth': [1, 2, 3, 4]}, pre_dispatch='2*n_jobs',
       refit=True, score_func=None, scoring=None, verbose=0)

这些只是调用网格搜索时使用的参数。要确定最佳参数，请使用explorer.best_params_，或者您可以使用explorer.best_estimator_找到估计器，前提是启用了refit。

相关内容

最新更新

热门标签：