I am a beginner with scikit-learn, trying to classify the iris dataset. In the cross-validation step I run into trouble when changing the scoring metric from scoring='accuracy' to something else (precision, recall, f1, etc.). Below is the complete code example (it is enough to start reading from # Test options and evaluation metric).
# Load libraries
import pandas
from pandas.plotting import scatter_matrix
import matplotlib.pyplot as plt
from sklearn import model_selection # for command model_selection.cross_val_score
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
# Load dataset
url = "https://raw.githubusercontent.com/jbrownlee/Datasets/master/iris.csv"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
dataset = pandas.read_csv(url, names=names)
# Split-out validation dataset
array = dataset.values
X = array[:,0:4]
Y = array[:,4]
validation_size = 0.20
seed = 7
X_train, X_validation, Y_train, Y_validation = model_selection.train_test_split(X, Y, test_size=validation_size, random_state=seed)
# Test options and evaluation metric
seed = 7
scoring = 'accuracy'
#Below, we build and evaluate 6 different models
# Spot Check Algorithms
models = []
models.append(('LR', LogisticRegression()))
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('NB', GaussianNB()))
models.append(('SVM', SVC()))
# evaluate each model in turn: compute the cv scores and their mean and std for each model
results = []
names = []
for name, model in models:
    # below, we do k-fold cross-validation
    kfold = model_selection.KFold(n_splits=10, random_state=seed)
    cv_results = model_selection.cross_val_score(model, X_train, Y_train, cv=kfold, scoring=scoring)
    results.append(cv_results)
    names.append(name)
    msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)
Now, in addition to scoring='accuracy', I would like to evaluate other performance metrics for this multiclass classification problem. But when I use scoring='precision', it raises:
ValueError: Target is multiclass but average='binary'. Please choose another average setting.
My questions are:
1) I guess the above happens because 'precision' and 'recall' are defined in scikit-learn only for binary classification. Is that correct? If so, which command(s) should replace scoring='accuracy' in the code above?
2) If I want to compute the confusion matrix, precision, and recall for each fold while performing k-fold cross-validation, what commands should I type?
3) As an experiment, I tried scoring='balanced_accuracy', only to find:
ValueError: 'balanced_accuracy' is not a valid scoring value.
How can that be, when the model evaluation documentation (https://scikit-learn.org/stable/modules/model_evaluation.html) clearly says that balanced_accuracy is a valid scoring method? I am quite confused here, so actual code showing how to evaluate these other performance metrics would be greatly appreciated! Thanks in advance!
1) I guess this happens because 'precision' and 'recall' are defined in scikit-learn only for binary classification. Is that correct?
No. Precision and recall are of course applicable to multiclass problems as well; see the documentation for precision and recall.
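As a quick illustration (a minimal sketch with toy multiclass labels, not your iris data), the metric functions themselves work fine on multiclass targets as long as you pass an average argument:
from sklearn.metrics import precision_score, recall_score
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 1]
# the default average='binary' is what triggers the ValueError you saw;
# any multiclass-aware setting works:
print(precision_score(y_true, y_pred, average='macro'))
print(recall_score(y_true, y_pred, average='micro'))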
If so, which command(s) should replace scoring='accuracy' in the code above?
The problem arises because, as you can see from the documentation links I have provided above, the default setting for these metrics is binary classification (average='binary'). In your case of multiclass classification, you need to specify which exact "version" of the particular metric you are interested in (there is more than one); have a look at the relevant page of the scikit-learn documentation, but some valid options for the scoring parameter could be:
'precision_macro'
'precision_micro'
'precision_weighted'
'recall_macro'
'recall_micro'
'recall_weighted'
The documentation link above even contains an example of using 'recall_macro' with the iris data; be sure to check it out.
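To be explicit, in your own code the only change needed is the value of scoring; here is a minimal sketch reusing the variables already defined in your script (SVC as an example model, 'recall_macro' picked arbitrarily from the list above):
scoring = 'recall_macro'   # or 'precision_macro', 'f1_weighted', etc.
# shuffle=True so that random_state is actually used (newer scikit-learn versions require this)
kfold = model_selection.KFold(n_splits=10, shuffle=True, random_state=seed)
cv_results = model_selection.cross_val_score(SVC(), X_train, Y_train, cv=kfold, scoring=scoring)
print("SVM recall_macro: %f (%f)" % (cv_results.mean(), cv_results.std()))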
2) If I want to compute the confusion matrix, precision, and recall for each fold while performing k-fold cross-validation, what commands should I type?
This is not exactly trivial, but you can see one way of doing it in my answer to Cross-validation metrics in scikit-learn for each data split.
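In short, the idea is to run the CV loop yourself instead of delegating to cross_val_score. A rough sketch along those lines (not the exact code of the linked answer), again assuming the X_train, Y_train, and seed already defined in your script:
from sklearn.model_selection import KFold
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.svm import SVC

kfold = KFold(n_splits=10, shuffle=True, random_state=seed)  # shuffle so random_state is used
for i, (train_idx, test_idx) in enumerate(kfold.split(X_train)):
    model = SVC()
    model.fit(X_train[train_idx], Y_train[train_idx])
    pred = model.predict(X_train[test_idx])
    print("Fold %d" % i)
    print(confusion_matrix(Y_train[test_idx], pred))
    print("precision (macro): %f" % precision_score(Y_train[test_idx], pred, average='macro'))
    print("recall (macro): %f" % recall_score(Y_train[test_idx], pred, average='macro'))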
3) As an experiment, I tried scoring='balanced_accuracy', only to find:
ValueError: 'balanced_accuracy' is not a valid scoring value.
This is because you are probably using an older version of scikit-learn. balanced_accuracy became available only in v0.20; you can verify that it is not available in v0.18. Upgrade your scikit-learn to v0.20 and you should be fine.
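You can check which version you have installed directly from Python (the upgrade command in the comment assumes you installed scikit-learn with pip):
import sklearn
print(sklearn.__version__)   # 'balanced_accuracy' needs >= 0.20
# to upgrade, e.g.: pip install -U scikit-learn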