I am trying to understand how feature importance is computed for regression trees (and their ensemble counterparts). I am looking at the source code of the `compute_feature_importances` function in /sklearn/tree/_tree.pyx, but I can't quite follow the logic, and no reference is given.
Sorry if this is a very basic question, but I couldn't find a good literature reference, and I was hoping someone could point me in the right direction or quickly explain the code so I can keep digging.
Thanks
Referring to the documentation rather than the code:
`feature_importances_` : array of shape = [n_features]
The feature importances. The higher, the more important the
feature. The importance of a feature is computed as the (normalized)
total reduction of the criterion brought by that feature. It is also
known as the Gini importance [4]_.
.. [4] L. Breiman, and A. Cutler, "Random Forests",
http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm
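In plain terms, each split attributes a (sample-weighted) impurity decrease to the feature it splits on; summing those decreases over all internal nodes and normalizing gives the importance vector. Below is a minimal Python sketch of that computation using the public `tree_` attributes of a fitted estimator (`children_left`, `children_right`, `feature`, `impurity`, `weighted_n_node_samples`). The helper name `gini_importance` is my own, not part of scikit-learn, and this is only my reading of the docstring, not a verbatim copy of `_tree.pyx`.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.tree import DecisionTreeRegressor

def gini_importance(estimator):
    """Sketch: (normalized) total impurity reduction attributed to each feature."""
    t = estimator.tree_  # low-level tree structure exposed by sklearn
    importances = np.zeros(t.n_features)
    for node in range(t.node_count):
        left, right = t.children_left[node], t.children_right[node]
        if left == -1:  # leaf node: no split, nothing to attribute
            continue
        # weighted impurity decrease produced by this node's split
        decrease = (
            t.weighted_n_node_samples[node] * t.impurity[node]
            - t.weighted_n_node_samples[left] * t.impurity[left]
            - t.weighted_n_node_samples[right] * t.impurity[right]
        )
        importances[t.feature[node]] += decrease
    importances /= t.weighted_n_node_samples[0]  # scale by total weight at the root
    total = importances.sum()
    return importances / total if total > 0 else importances  # normalize to sum to 1

X, y = make_regression(n_samples=200, n_features=5, random_state=0)
reg = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)
print(gini_importance(reg))        # should closely match the attribute below
print(reg.feature_importances_)
```

For a forest, the per-tree vectors computed this way are averaged across the ensemble, which is what `feature_importances_` on the ensemble estimators reports.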