Scikit-learn: partial_dependence only takes 2 features

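I'm using scikit-learn 0.14.1 and I want to get the partial dependence values back rather than the figure that plot_partial_dependence produces. I thought I could use partial_dependence for that, but I ran into trouble.

It seems partial_dependence only accepts two features, whereas I need the values for just one feature.

When I modify the example code from the scikit-learn website (changing target_feature = (1, 2) to target_feature = (1)), it complains: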

*** ValueError: need more than 1 value to unpack

The code is as follows:

from sklearn.cross_validation import train_test_split
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.ensemble.partial_dependence import plot_partial_dependence
from sklearn.ensemble.partial_dependence import partial_dependence
from sklearn.datasets.california_housing import fetch_california_housing
cal_housing = fetch_california_housing()
X_train, X_test, y_train, y_test = train_test_split(cal_housing.data,
                                                    cal_housing.target,
                                                    test_size=0.2,
                                                    random_state=1)
names = cal_housing.feature_names
clf = GradientBoostingRegressor(n_estimators=100, max_depth=4,
                                learning_rate=0.1, loss='huber',
                                random_state=1)
clf.fit(X_train, y_train)
target_feature = (1)
pdp, (x_axis, y_axis) = partial_dependence(clf, target_feature, X=X_train, grid_resolution=50)

The docstring in the source code says:

target_variables : array-like, dtype=int
    The target features for which the partial dependecy should be
    computed (size should be smaller than 3 for visual renderings).

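Can anyone help me figure out what I'm doing wrong, or help me extract the partial dependence values for the single feature I need?

Thanks a lot!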

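Here is Peter Prettenhofer's reply to my email. I'm posting it here in case someone else needs it too.

Here is the problem:

The left-hand side of the assignment assumes the result is a two-way partial dependence plot, but it is a one-way PDP, so only one axis is returned. This should fix it: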
pdp, (x_axis, ) = partial_dependence(clf, target_feature, X=X_train, grid_resolution=50)

It works fine, thanks a lot!
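To make the unpacking issue concrete without needing a fitted model, here is a minimal sketch. The function fake_partial_dependence is a hypothetical stand-in, not scikit-learn code; it only assumes the return shape implied by the fix above, namely a pair (pdp, axes) where axes holds one grid per target feature.

```python
# Hypothetical stand-in for partial_dependence: NOT scikit-learn code.
# It mimics only the assumed return shape (pdp, axes), where axes
# contains one grid per target feature.
def fake_partial_dependence(target_features, grid_resolution=5):
    axes = [[i / (grid_resolution - 1) for i in range(grid_resolution)]
            for _ in target_features]
    n_points = grid_resolution ** len(target_features)
    pdp = [[0.0] * n_points]  # one row of placeholder values
    return pdp, axes

# One target feature: axes has length 1, so unpacking it into two
# names raises ValueError (the exact message depends on the Python version).
try:
    pdp, (x_axis, y_axis) = fake_partial_dependence([1])
    two_name_unpack_failed = False
except ValueError:
    two_name_unpack_failed = True

# A one-element pattern matches the one-way result.
pdp, (x_axis,) = fake_partial_dependence([1])

# Alternatively, keep the sequence and index it; this works for
# any number of target features.
pdp2, axes2 = fake_partial_dependence([1, 2])
```

Keeping axes as a single name and indexing into it avoids having to change the left-hand side whenever the number of target features changes.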

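I think the problem is that target_feature = (1) evaluates to the integer 1 rather than the tuple (1,). It happens to me all the time, so I mostly use lists ([1]) for sequence literals.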
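A quick check in plain Python shows the difference between the three spellings; the variable names are only for illustration:

```python
a = (1)    # parentheses only group the expression: this is the int 1
b = (1,)   # the trailing comma is what makes a one-element tuple
c = [1]    # a list literal has no such ambiguity

print(type(a).__name__, type(b).__name__, type(c).__name__)
# prints: int tuple list
```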
