I'm new to Python, and while doing logistic regression I ran into several errors, shown below. Here is my code, followed by the error messages:
import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt
%matplotlib inline
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score
from sklearn.metrics import roc_curve, roc_auc_score
X = dataset_df
Y = dataset_df
'X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state=1)'
X_train, X_validation, y_train, y_validation = train_test_split(X_train, y_train, test_size = 0.3, random_state=1)
sc = StandardScaler()
sc.fit(X_train)
X_train_Std = sc.transform(X_train)
lr_classifier = LogisticRegression(C = 1000, random_state= 1)
rf_classifier = RandomForestClassifier(max_depth=5, random_state= 1)
rf_classifier.fit(X_train_Std, y_train)
rf_classifier.predict_proba(sc.transform(X_validation))
Here:
roc_auc_score(y_true=y_test, y_score=lr_2.predict(X_test_std_pca_1))
NameError: name 'lr_2' is not defined
and
max_depth_params = [2, 3, 5 ,10]
for max_depth in max_depth_params:
    rf_classifier = RandomForestClassifier(max_depth=max_depth, random_state= 1)
    rf_classifier.fit(X_train_Std, y_train)
    y_pred2 = rf_classifier.predict(sc.transform(X_validation))
    print('max depth param:', max_depth, 'accuracy:', accuracy_score(y_true=y_validation, y_pred=y_pred2))
ValueError: multiclass-multioutput is not supported
and
lr_classifier.fit(X_train_Std, y_train)
y_pred = lr_classifier.predict(sc.transform(X_validation))
ValueError: y should be a 1d array, got an array of shape (3876, 16) instead.
and finally:
y_pred2 = rf_classifier.predict(sc.transform(X_validation))
print('Misclassified samples {0} out of {1}, i.e. {2:.2f}% accurate'.
format((y_validation != y_pred).sum(), len(y_validation), (1 - (y_validation != y_pred).sum()/len(y_validation))*100))
TypeError: unsupported format string passed to Series.__format__
With this many error messages I feel like my head is about to explode. If anyone can help me, I would be very grateful 🙏
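- Variable name error: lr_2 is not a variable you defined; you named the model lr_classifier.
- Your target, y, is not a single column but 2D (multiclass-multioutput). The error states this clearly: y is not a 1D array, it has shape (3876, 16).
- The unsupported format string error is probably because the value being formatted is a Series rather than a single number.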
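To make the first two points concrete, here is a minimal sketch on synthetic data. The real dataset_df is not shown in the question, so the column names f1, f2, f3 and target are made up for illustration; the idea is only that y should be one column and that the model is referenced by the name it was given.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for dataset_df; the column names are hypothetical.
rng = np.random.default_rng(1)
dataset_df = pd.DataFrame(rng.normal(size=(200, 3)), columns=["f1", "f2", "f3"])
noise = rng.normal(scale=0.5, size=200)
dataset_df["target"] = (dataset_df["f1"] + dataset_df["f2"] + noise > 0).astype(int)

X = dataset_df.drop(columns=["target"])  # features only (2D)
y = dataset_df["target"]                 # a single column (1D Series)

X_train, X_validation, y_train, y_validation = train_test_split(
    X, y, test_size=0.3, random_state=1
)

# Fit the scaler on the training split only, then reuse it for validation.
sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)
X_validation_std = sc.transform(X_validation)

lr_classifier = LogisticRegression(C=1000, random_state=1)
lr_classifier.fit(X_train_std, y_train)

# Refer to the model by the name that was actually defined (lr_classifier, not lr_2).
scores = lr_classifier.predict_proba(X_validation_std)[:, 1]
auc = roc_auc_score(y_true=y_validation, y_score=scores)
print(y.shape, auc)
```

With a 1D y, both the "y should be a 1d array" and the "multiclass-multioutput is not supported" errors should no longer apply, since they both come from passing the whole frame as the target.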
Try to debug one error at a time, and learn the basics of Python before moving on to Machine Learning.
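As an example of isolating one error, the last TypeError can be reproduced without any model at all. The sketch below uses made-up labels and predictions; it assumes the cause in the question is that y_validation is a DataFrame, so the comparison followed by .sum() yields a Series instead of a number.

```python
import pandas as pd

# Made-up data, only for illustration.
y_frame = pd.DataFrame({"a": [0, 1, 1, 0], "b": [1, 1, 0, 0]})  # 2D, like Y = dataset_df
y_true = pd.Series([0, 1, 1, 0])                                 # 1D target
y_pred = pd.Series([0, 1, 0, 0])

# Comparing a DataFrame and summing gives one count per column, i.e. a Series.
per_column = (y_frame != 0).sum()
try:
    "{0:.2f}".format(per_column)
    series_formats = True
except TypeError:
    # A Series does not accept a float format spec such as ".2f".
    series_formats = False

# With 1D labels the sum is a single number, which formats fine.
n_wrong = int((y_true != y_pred).sum())
accuracy = (1 - n_wrong / len(y_true)) * 100
message = "Misclassified samples {0} out of {1}, i.e. {2:.2f}% accurate".format(
    n_wrong, len(y_true), accuracy
)
print(message)
```

Note that the print statement in the question also compares against y_pred while the line above it assigns y_pred2, so it may be worth checking which predictions are meant there.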