Logistic Regression - "multiclass-multioutput is not supported" + other errors



I am new to Python, and I ran into several problems while doing logistic regression, as shown below. Here is my code, followed by the error messages:

import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
from sklearn.metrics import roc_curve, roc_auc_score

X = dataset_df
Y = dataset_df
'X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state=1)'
X_train, X_validation, y_train, y_validation = train_test_split(X_train, y_train, test_size = 0.3, random_state=1)
sc = StandardScaler()
sc.fit(X_train)
X_train_Std = sc.transform(X_train)
lr_classifier = LogisticRegression(C = 1000, random_state= 1)
rf_classifier = RandomForestClassifier(max_depth=5, random_state= 1)
rf_classifier.fit(X_train_Std, y_train)
rf_classifier.predict_proba(sc.transform(X_validation))

Here:

roc_auc_score(y_true=y_test, y_score=lr_2.predict(X_test_std_pca_1))

NameError: name 'lr_2' is not defined

max_depth_params = [2, 3, 5 ,10]
for max_depth in max_depth_params:
    rf_classifier = RandomForestClassifier(max_depth=max_depth, random_state= 1)
    rf_classifier.fit(X_train_Std, y_train)
    y_pred2 = rf_classifier.predict(sc.transform(X_validation))

    print('max depth param:', max_depth, 'accuracy:', accuracy_score(y_true=y_validation, y_pred=y_pred2))

ValueError: multiclass-multioutput is not supported

lr_classifier.fit(X_train_Std, y_train)
y_pred = lr_classifier.predict(sc.transform(X_validation))

ValueError: y should be a 1d array, got an array of shape (3876, 16) instead.

And finally:

y_pred2 = rf_classifier.predict(sc.transform(X_validation))
print('Misclassified samples {0} out of {1}, i.e. {2:.2f}% accurate'.
format((y_validation != y_pred).sum(), len(y_validation), (1 - (y_validation != y_pred).sum()/len(y_validation))*100))

TypeError: unsupported format string passed to Series.__format__

With so many error messages, I feel like my head is going to explode. I would be very grateful if someone could help 🙏

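  1. Wrong variable name: lr_2 is never defined. You named the model lr_classifier.
  2. Your target y is not a single column; it is 2D (multiclass-multioutput). In the code shown, X and Y are both assigned the entire dataset_df, so the target has as many columns as the features.
  3. The message states the problem plainly: y is not a 1D array. It has shape (3876, 16), while LogisticRegression expects a single column of labels.
  4. The unsupported type is most likely a Series: because y_validation is 2D, the comparison followed by .sum() yields a Series rather than a single number, and a Series cannot be formatted with {2:.2f}.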
Try debugging one error at a time, and learn the basics of Python before moving on to machine learning.
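As an example of isolating one error, the formatting failure can be reproduced on its own with toy data. The labels below are made up for illustration, and the column names a and b are hypothetical. The sketch shows that the format string works once the counts are plain numbers, and fails when the value being formatted is a Series.

```python
import pandas as pd

# Hypothetical toy labels and predictions, for illustration only.
y_validation = pd.Series([0, 1, 1, 0, 1])
y_pred = pd.Series([0, 1, 0, 0, 1]).to_numpy()

# With a 1D target the comparison sums to a plain number, so '{2:.2f}' works.
n_wrong = int((y_validation.to_numpy() != y_pred).sum())
n_total = len(y_validation)
accuracy_pct = (1 - n_wrong / n_total) * 100
message = "Misclassified samples {0} out of {1}, i.e. {2:.2f}% accurate".format(
    n_wrong, n_total, accuracy_pct
)
print(message)  # Misclassified samples 1 out of 5, i.e. 80.00% accurate

# With a 2D target (a DataFrame), .sum() returns one value per column (a Series),
# and formatting that Series with '.2f' raises TypeError.
y_2d = pd.DataFrame({"a": [0, 1], "b": [1, 1]})
per_column = (y_2d != 0).sum()
try:
    "{0:.2f}".format(per_column)
    raised = False
except TypeError:
    raised = True
print("TypeError raised for Series:", raised)
```

Note that the original snippet also compares y_validation against y_pred while the line above it assigns y_pred2, so it is worth checking which prediction you meant to report.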
