How to compute the imbalance accuracy metric in multi-class classification



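Sorry to bother you, but I came across an interesting article: Mortaz, E. (2020). Imbalance accuracy metric for model selection in multi-class imbalance classification problems. Knowledge-Based Systems, 210, 106490 (https://www.sciencedirect.com/science/article/pii/S0950705120306195). In it, the author computes this metric (IAM). The formula is given in the paper and I understand it, but my question is: how can I reproduce it in R?

Apologies in advance if this is a silly question. Thank you for your attention!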

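The IAM formula given in the article is:

$$\mathrm{IAM} = \frac{1}{k}\sum_{i=1}^{k}\frac{c_{ii} - \max\left(\sum_{j \neq i} c_{ij},\ \sum_{j \neq i} c_{ji}\right)}{\max\left(\sum_{j=1}^{k} c_{ij},\ \sum_{j=1}^{k} c_{ji}\right)}$$

where c_ij is element (i, j) of the classifier's confusion matrix c, and k is the number of classes (k >= 2). The article shows that this metric can be used as a single standalone metric for model selection in multi-class problems.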

Here is a Python implementation of IAM (imbalance accuracy metric):

def IAM(c):
    '''
    c is a nested list representing the confusion matrix of the classifier (len(c) >= 2)
    '''
    l = len(c)
    iam = 0
    for i in range(l):
        sum_row = 0        # total of row i
        sum_col = 0        # total of column i
        sum_row_no_i = 0   # row i without the diagonal element
        sum_col_no_i = 0   # column i without the diagonal element
        for j in range(l):
            sum_row += c[i][j]
            sum_col += c[j][i]
            if j != i:
                sum_row_no_i += c[i][j]
                sum_col_no_i += c[j][i]
        iam += (c[i][i] - max(sum_row_no_i, sum_col_no_i)) / max(sum_row, sum_col)
    return iam / l

c = [[2129, 52, 0, 1],
     [499, 70, 0, 2],
     [46, 16, 0, 1],
     [85, 18, 0, 7]]
print(IAM(c))  # -0.5210576475801445
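To see which classes drive the score, the same computation can be sketched with NumPy so that each class's term is visible before averaging. This is only an illustrative variant: the names `iam_per_class` and `iam_vectorized` are my own, and it assumes every class has at least one true or predicted sample (otherwise the denominator would be zero).

```python
import numpy as np

def iam_per_class(conf):
    """Per-class IAM terms for a square confusion matrix (hypothetical helper)."""
    conf = np.asarray(conf, dtype=float)
    diag = np.diag(conf)
    row_sums = conf.sum(axis=1)  # total of row i
    col_sums = conf.sum(axis=0)  # total of column i
    # Larger of the two off-diagonal totals for each class
    off_diag = np.maximum(row_sums - diag, col_sums - diag)
    # Assumes max(row_sums, col_sums) > 0 for every class
    return (diag - off_diag) / np.maximum(row_sums, col_sums)

def iam_vectorized(conf):
    """IAM as the mean of the per-class terms."""
    return iam_per_class(conf).mean()

c = [[2129, 52, 0, 1],
     [499, 70, 0, 2],
     [46, 16, 0, 1],
     [85, 18, 0, 7]]
print(iam_per_class(c))   # roughly [0.543, -0.755, -1.0, -0.873]
print(iam_vectorized(c))  # roughly -0.5211
```

Only the first class contributes a positive term here; the three minority classes pull the average below zero.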

Here is an R implementation of IAM (imbalance accuracy metric):

IAM <- function(c) {
  # c is a matrix representing the confusion matrix of the classifier.
  l <- nrow(c)
  result <- 0

  for (i in 1:l) {
    sum_row <- 0
    sum_col <- 0
    sum_row_no_i <- 0
    sum_col_no_i <- 0
    for (j in 1:l) {
      sum_row <- sum_row + c[i, j]
      sum_col <- sum_col + c[j, i]
      if (i != j) {
        sum_row_no_i <- sum_row_no_i + c[i, j]
        sum_col_no_i <- sum_col_no_i + c[j, i]
      }
    }
    result <- result + (c[i, i] - max(sum_row_no_i, sum_col_no_i)) / max(sum_row, sum_col)
  }
  return(result / l)
}

# byrow = TRUE so the matrix has the same layout as the Python nested list
c <- matrix(c(2129, 52, 0, 1, 499, 70, 0, 2, 46, 16, 0, 1, 85, 18, 0, 7),
            nrow = 4, ncol = 4, byrow = TRUE)
print(IAM(c), digits = 16)  # approximately -0.5210576475801445
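One detail worth knowing when porting between the two languages: R's `matrix()` fills column by column unless `byrow = TRUE` is passed, while the Python nested list is written row by row. The sketch below suggests this does not change the IAM value, because the formula treats rows and columns symmetrically, so a matrix and its transpose give the same score. The helper `iam` and the names `by_row` and `by_col` are introduced here purely for illustration.

```python
def iam(conf):
    """IAM for a square confusion matrix given as a nested list."""
    k = len(conf)
    total = 0.0
    for i in range(k):
        row = sum(conf[i][j] for j in range(k))
        col = sum(conf[j][i] for j in range(k))
        diag = conf[i][i]
        total += (diag - max(row - diag, col - diag)) / max(row, col)
    return total / k

values = [2129, 52, 0, 1, 499, 70, 0, 2, 46, 16, 0, 1, 85, 18, 0, 7]
n = 4
# Row-major fill, like matrix(..., byrow = TRUE) in R
by_row = [values[r * n:(r + 1) * n] for r in range(n)]
# Column-major fill, like R's default matrix(...)
by_col = [[values[col * n + r] for col in range(n)] for r in range(n)]

print(iam(by_row))
print(iam(by_col))  # same value as the line above
```

The score is the same either way, but the layout still matters if you also read per-class recall or precision off the same matrix.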

Another example, using the iris dataset (a 3-class problem) and sklearn:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

# IAM is the Python function defined earlier in this answer
X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)
pred = clf.predict(X)
c = confusion_matrix(y, pred)
print('confusion matrix:')
print(c)
print(f'accuracy : {clf.score(X, y):.2f}')
print(f'IAM : {IAM(c):.2f}')

Output:

confusion matrix:
[[50  0  0]
 [ 0 47  3]
 [ 0  1 49]]
accuracy : 0.97
IAM : 0.92
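On a balanced dataset like iris, accuracy and IAM tell a similar story. The difference shows up under imbalance. Here is a minimal sketch on a made-up confusion matrix (not taken from the paper) in which a classifier assigns every sample to the majority class. Rows are true classes and columns are predictions, following the sklearn convention; `iam` and `accuracy` are illustrative helpers.

```python
def iam(conf):
    """IAM for a square confusion matrix given as a nested list."""
    k = len(conf)
    total = 0.0
    for i in range(k):
        row = sum(conf[i])
        col = sum(conf[j][i] for j in range(k))
        diag = conf[i][i]
        total += (diag - max(row - diag, col - diag)) / max(row, col)
    return total / k

def accuracy(conf):
    """Fraction of samples on the diagonal."""
    correct = sum(conf[i][i] for i in range(len(conf)))
    return correct / sum(sum(r) for r in conf)

# Hypothetical data: 90/5/5 samples per class, everything predicted as class 0
majority_only = [[90, 0, 0],
                 [5, 0, 0],
                 [5, 0, 0]]

print(f"accuracy: {accuracy(majority_only):.2f}")  # 0.90
print(f"IAM: {iam(majority_only):.2f}")            # -0.40
```

Accuracy looks good at 0.90, while IAM is negative because the two minority classes each contribute a term of -1.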
