I ran a Python program that calls sklearn.metrics methods to compute precision and the F1 score. Here is the output when there are no predicted samples:
/xxx/py2-scikit-learn/0.15.2-comp6/lib/python2.6/site-packages/sklearn/metrics/metrics.py:1771: UndefinedMetricWarning: Precision is ill-defined and being set to 0.0 due to no predicted samples.
'precision', 'predicted', average, warn_for)
/xxx/py2-scikit-learn/0.15.2-comp6/lib/python2.6/site-packages/sklearn/metrics/metrics.py:1771: UndefinedMetricWarning: F-score is ill-defined and being set to 0.0 due to no predicted samples.
'precision', 'predicted', average, warn_for)
When there are no predicted samples, TP+FP is 0, so
- precision (defined as TP/(TP+FP)) is 0/0, which is undefined
- the F1 score (defined as 2TP/(2TP+FP+FN)) is 0 if FN is not zero
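The reasoning above can be checked with a few lines of plain Python. This is a minimal sketch, not scikit-learn code: the function names and the count FN = 4 are hypothetical, chosen only to illustrate the case TP + FP = 0 with FN > 0.

```python
def precision_from_counts(tp, fp):
    """Return TP / (TP + FP), or None when nothing was predicted positive."""
    predicted = tp + fp
    if predicted == 0:
        return None  # 0/0: precision is undefined
    return float(tp) / predicted


def f1_from_counts(tp, fp, fn):
    """Return 2TP / (2TP + FP + FN), or None when the denominator is 0."""
    denom = 2 * tp + fp + fn
    if denom == 0:
        return None
    return 2.0 * tp / denom


# Hypothetical counts: no predicted positives, four missed positives.
tp, fp, fn = 0, 0, 4
precision = precision_from_counts(tp, fp)
f1 = f1_from_counts(tp, fp, fn)
print(precision, f1)
```

With the count-based formula, F1 is a well-defined 0 here even though precision is not defined.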
In my case, sklearn.metrics also returns an accuracy of 0.8 and a recall of 0, so FN is not zero.
But why does scikit-learn say F1 is ill-defined?
What definition of F1 does scikit-learn use?
https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/metrics/classification.py
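F1 = 2 * (precision * recall) / (precision + recall)
precision = TP/(TP+FP). As you just said, if the predictor never predicts the positive class, this is 0/0, which scikit-learn sets to 0.
recall = TP/(TP+FN). If the predictor never predicts the positive class, TP is 0, so recall is 0.
So F1 now comes out as 0/0.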
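The 0/0 in the precision-recall form of F1 can be sketched in plain Python. This is an illustration only, not scikit-learn's source: the function name `f1_from_precision_recall` and the choice to return 0.0 together with an "ill-defined" flag are assumptions that mirror the behaviour described by the warning.

```python
def f1_from_precision_recall(precision, recall):
    """Return (f1, ill_defined) for F1 = 2 * P * R / (P + R).

    When P + R == 0 the formula is 0/0, so 0.0 is returned as a
    fallback and the second element flags the value as ill-defined.
    """
    denom = precision + recall
    if denom == 0:
        return 0.0, True
    return 2.0 * precision * recall / denom, False


# Precision was set to 0.0 by the warning, recall is 0 because TP is 0.
f1, ill_defined = f1_from_precision_recall(0.0, 0.0)
print(f1, ill_defined)
```

Later scikit-learn releases add a `zero_division` argument to `precision_score`, `recall_score` and `f1_score` for choosing this fallback value; whether it is available depends on the installed version.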
Precision, Recall, F1 score and Accuracy calculation
- In a given image of Dogs and Cats
* Total Dogs - 12 D = 12
* Total Cats - 8 C = 8
- Computer program predicts
* Dogs - 8
5 are actually Dogs T.P = 5
3 are not F.P = 3
* Cats - 12
6 are actually Cats T.N = 6
6 are not F.N = 6
- Calculation
* Precision = T.P / (T.P + F.P) => 5 / (5 + 3)
* Recall = T.P / D => 5 / 12
* F1 = 2 * (Precision * Recall) / (Precision + Recall)
* F1 = 0.5
* Accuracy = (T.P + T.N) / (D + C) => (5 + 6) / 20
* Accuracy = 0.55
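The arithmetic in this example can be reproduced with a short Python sketch. It uses the counts exactly as listed above and, like the list, computes recall as T.P divided by the total number of dogs; the variable names are illustrative.

```python
# Counts taken from the example above.
total_dogs = 12  # D, the actual positives
total_cats = 8   # C, the actual negatives
tp = 5           # predicted Dog, actually Dog
fp = 3           # predicted Dog, actually not
tn = 6           # predicted Cat, actually Cat

precision = float(tp) / (tp + fp)
recall = float(tp) / total_dogs
f1 = 2 * precision * recall / (precision + recall)
accuracy = float(tp + tn) / (total_dogs + total_cats)

print("precision = %.3f" % precision)  # precision = 0.625
print("recall    = %.3f" % recall)     # recall    = 0.417
print("f1        = %.3f" % f1)         # f1        = 0.500
print("accuracy  = %.3f" % accuracy)   # accuracy  = 0.550
```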
Wikipedia reference