I am trying to select important features (or at least understand which ones explain more of the variability). To do this I am using both ExtraTreesClassifier and GradientBoostingRegressor, and then:
clf = ExtraTreesClassifier(n_estimators=10,max_features='auto',random_state=0) # stops after 10 estimation passes, right ?
clf.fit(x_train, y_train)
feature_importance=clf.feature_importances_ # does NOT work - returns NoneType for feature_importance
After this I am really interested in plotting the importances (for a visual representation), or even just preliminarily looking at their relative order and the corresponding indices:
import numpy

# Both of these fail because feature_importance is None
feature_importance = 100.0 * (feature_importance / feature_importance.max())
indices = numpy.argsort(feature_importance)[::-1]
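For what it's worth, the scaling and sorting logic itself is fine once feature_importance is a real array. A minimal sketch with a hypothetical importance vector (the values below are made up purely for illustration):

```python
import numpy

# Hypothetical importances; in practice this would come from
# clf.feature_importances_ after a successful fit.
feature_importance = numpy.array([0.05, 0.30, 0.10, 0.25, 0.20, 0.10])

# Scale to a 0-100 range relative to the largest importance.
feature_importance = 100.0 * (feature_importance / feature_importance.max())

# Feature indices, most important first.
indices = numpy.argsort(feature_importance)[::-1]
print(indices)
print(feature_importance[indices])
```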
What I find confusing is that if I instead use GradientBoostingRegressor as below, I do get feature_importance and its indices. What am I doing wrong?
#Works with GradientBoostingRegressor
params = {'n_estimators': 100, 'max_depth': 3, 'learning_rate': 0.1, 'loss': 'lad'}
clf = GradientBoostingRegressor(**params).fit(x_train, y_train)  # fit() already called here, no second call needed
feature_importance=clf.feature_importances_
Additional info: I have 12 independent variables (x_train) and one label variable (y_train) with multiple values (e.g. 4, 5, 7); as noted above, feature_importance comes back as NoneType.
Acknowledgement: some elements were borrowed from this post: http://www.tonicebrian.com/2012/11/05/training-gradient-boosting-with-python/
When initializing ExtraTreesClassifier, there is a compute_importances parameter that defaults to None. In other words, you need to initialize ExtraTreesClassifier as

clf = ExtraTreesClassifier(n_estimators=10,max_features='auto',random_state=0,compute_importances=True)

so that the feature importances are actually computed.
As for GradientBoostingRegressor, there is no such option, and feature importances are always computed.
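Note for current readers: in recent scikit-learn releases the compute_importances argument has been removed entirely, and feature_importances_ is populated automatically after fit for ExtraTreesClassifier as well. A minimal sketch on synthetic data (make_classification is used here purely for illustration):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

# Synthetic classification data standing in for x_train / y_train.
x_train, y_train = make_classification(n_samples=200, n_features=12,
                                       n_informative=4, random_state=0)

# No compute_importances flag needed in modern scikit-learn.
clf = ExtraTreesClassifier(n_estimators=10, random_state=0)
clf.fit(x_train, y_train)

importances = clf.feature_importances_  # available directly after fit
print(importances is not None)
```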