cross_val_predict with method='predict_proba' fails with "index 1 is out of bounds for axis 1 with size 1"

Here is part of the dataset:

a  b  c  result
0  1  1  positive
0  0  1  negative
0  1  1  negative
0  0  0  positive

result = [1 if v == 'positive' else 0 for v in data['result'].tolist()]
Output = result
X = data[["a", "b", "c"]]
y = np.reshape(Output, (X.shape[0], 1))

I am trying to use cross-validation from sklearn to predict the class of the X data. This part of the code works:

logreg = LogisticRegression(penalty='l2')
y_pred_class = cross_val_predict(logreg, X, y, cv=10, method='predict')

But when I try to compute the class probabilities with the following code:

y_pred_prob = cross_val_predict(logreg, X, y, cv=10, method='predict_proba')

it raises this error:

index 1 is out of bounds for axis 1 with size 1

Do you know what the problem is?

When you make the call with method="predict", you get a warning:

/usr/local/lib/python3.8/dist-packages/sklearn/utils/validation.py:72: DataConversionWarning: A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
  return f(**kwargs)
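To make the shape difference in that warning concrete, here is a small sketch. It uses only the four labels from the sample rows in the question (an assumption, since the full dataset is not shown), and the names `y_col` and `y_flat` are made up for illustration.

```python
import numpy as np

# Labels for the four sample rows: positive, negative, negative, positive
result = [1, 0, 0, 1]

# What the question's code builds: a 2-D column vector
y_col = np.reshape(result, (len(result), 1))

# What the warning asks for: a flat 1-D array
y_flat = y_col.ravel()

print(y_col.shape)   # (4, 1)
print(y_flat.shape)  # (4,)
```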

If you heed that warning, the error with method="predict_proba" goes away as well. All you need to do is change this line

y = np.reshape(Output, (X.shape[0], 1))

to

y = np.reshape(Output, (X.shape[0],))

or even

y = np.array(result)

Or skip the list comprehension altogether and stay in pandas:

y = data["result"].replace({"positive": 1, "negative": 0})
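Putting it together, a minimal end-to-end sketch might look like the following. The DataFrame is synthetic and stands in for the real data, which the question shows only in part. The 40 rows, the random seed, and cv=5 instead of cv=10 are assumptions chosen to keep the toy example small. The label column is built with a boolean comparison here, which should be equivalent to the replace call above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

# Synthetic stand-in for the question's data (assumption: binary features a, b, c)
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.integers(0, 2, size=(40, 3)), columns=["a", "b", "c"])
# Alternate the labels so every fold contains both classes
data["result"] = np.where(np.arange(40) % 2 == 0, "positive", "negative")

X = data[["a", "b", "c"]]
# 1-D target of shape (40,), not a (40, 1) column vector
y = (data["result"] == "positive").astype(int).to_numpy()

logreg = LogisticRegression(penalty="l2")
y_pred_prob = cross_val_predict(logreg, X, y, cv=5, method="predict_proba")

# One row per sample, one column per class
print(y_pred_prob.shape)  # (40, 2)
```

Each row of y_pred_prob holds the probabilities for the two classes; if the columns follow the sorted class labels as expected, the second column is the probability of "positive" (encoded as 1).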
