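A prediction on a test sample falls into one of four cases:
True positive (TP): predicted positive, actually positive
False positive (FP): predicted positive, actually negative
False negative (FN): predicted negative, actually positive
True negative (TN): predicted negative, actually negative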
Precision (P) : TP / (TP + FP)
Recall (R) : TP / (TP + FN)
F1-score : 2(PR) / (P + R)
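The three formulas above can be sketched as one small function. This is only an illustration: the name precision_recall_f1, the counts passed to it, and the choice to return 0.0 when a denominator is zero are all assumptions, not part of the original text or of sklearn.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) computed from raw counts.

    Returning 0.0 when a denominator is zero is an assumption of this sketch.
    """
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, chosen only for illustration.
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=8)
print(p, r, f1)
```

TN does not appear in any of the three formulas, so it is not needed as an input.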
Prototype of f1_score:
sklearn.metrics.f1_score(y_true,
                         y_pred,
                         labels=None,
                         pos_label=1,
                         average='binary',
                         sample_weight=None)
Parameters
1. y_true : ground-truth (correct) target values.
2. y_pred : estimated targets as returned by a classifier.
3. average : one of [None, 'binary' (default), 'micro', 'macro', 'samples', 'weighted'].
Required for multiclass/multilabel targets. The default, 'binary', applies to binary classification only.
4. labels : the class labels to include; optional.
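As a minimal sketch of the default average='binary' behaviour, the example below scores a two-class problem twice, once per choice of pos_label. The label lists are hypothetical and chosen only so that the two scores differ.

```python
from sklearn.metrics import f1_score

# Hypothetical binary labels, chosen only for illustration.
y_true_bin = [1, 1, 1, 1, 0, 0]
y_pred_bin = [1, 1, 1, 0, 0, 1]

# Default average='binary': F1 of the class named by pos_label (1 by default).
f1_pos = f1_score(y_true_bin, y_pred_bin)

# Same data, but treating class 0 as the positive class.
f1_neg = f1_score(y_true_bin, y_pred_bin, pos_label=0)

print(f1_pos, f1_neg)
```

With more than two classes, the 'binary' setting is not applicable, and one of the other average options has to be passed explicitly.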
Sample index:    1, 2, 3, 4, 5, 6, 7, 8, 9
True class:      A, A, A, A, B, B, B, C, C
Predicted class: A, A, B, C, B, B, C, B, C
The true-positive, false-positive, and false-negative counts for each class are then:
      A   B   C   Total
TP    2   2   1   5
FP    0   2   2   4
FN    2   1   1   4
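The counts in the table can be reproduced with a one-vs-rest tally over the two label lists. The sketch below uses plain Python; the variable names are illustrative.

```python
y_true = ["A", "A", "A", "A", "B", "B", "B", "C", "C"]
y_pred = ["A", "A", "B", "C", "B", "B", "C", "B", "C"]

pairs = list(zip(y_true, y_pred))
counts = {}
for cls in sorted(set(y_true) | set(y_pred)):
    # Treat `cls` as the positive class and every other class as negative.
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    counts[cls] = {"TP": tp, "FP": fp, "FN": fn}

for cls, c in counts.items():
    print(cls, c)

# Column totals, as in the last column of the table.
totals = {k: sum(c[k] for c in counts.values()) for k in ("TP", "FP", "FN")}
print("Total", totals)
```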
Precision of class A : PA = 2/(2+0) = 1
Recall of class A : RA = 2/(2+2) = 0.5
F1-score of class A : FA = 2*(1*0.5)/(1+0.5) ≈ 0.667
Precision of class B : PB = 2/(2+2) = 0.5
Recall of class B : RB = 2/(2+1) ≈ 0.667
F1-score of class B : FB = 2*(0.5*0.667)/(0.5+0.667) ≈ 0.571 (exactly 4/7)
Precision of class C : PC = 1/(1+2) ≈ 0.333
Recall of class C : RC = 1/(1+1) = 0.5
F1-score of class C : FC = 2*(0.333*0.5)/(0.333+0.5) ≈ 0.4
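F1-score over all the data:
There are two ways to compute it.
The first sums TP, FP, and FN over all classes and computes a single F1-score from those totals; this is micro averaging.
The second computes TP, FP, and FN for each class separately, derives each class's F1-score, and then takes the unweighted mean of those F1-scores; this is macro averaging.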
micro:
P = 5/(5+4) = 0.556
R = 5/(5+4) = 0.556
F1-score = 2*(0.556*0.556)/(0.556+0.556) = 0.556
macro:
Using the per-class F1-scores 2/3, 4/7, and 2/5:
F1-score = (0.667+0.571+0.400)/3 ≈ 0.546
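The two averaging schemes can also be checked by hand in a few lines of Python. This is a sketch: the helper f1_from_counts is a hypothetical name, and it relies on the identity 2PR/(P+R) = 2TP/(2TP+FP+FN), which follows from substituting the definitions of P and R.

```python
# Per-class (TP, FP, FN) counts taken from the table in the example.
counts = {"A": (2, 0, 2), "B": (2, 2, 1), "C": (1, 2, 1)}


def f1_from_counts(tp, fp, fn):
    # Same value as 2PR/(P+R); assumes 2*tp + fp + fn > 0.
    return 2 * tp / (2 * tp + fp + fn)


# micro: pool the counts over all classes, then compute one F1-score.
tp_sum = sum(tp for tp, _, _ in counts.values())
fp_sum = sum(fp for _, fp, _ in counts.values())
fn_sum = sum(fn for _, _, fn in counts.values())
f1_micro = f1_from_counts(tp_sum, fp_sum, fn_sum)

# macro: compute one F1-score per class, then take the unweighted mean.
per_class = [f1_from_counts(*c) for c in counts.values()]
f1_macro = sum(per_class) / len(per_class)

print(f1_micro, f1_macro)
```

Working from the counts directly avoids the small rounding drift that appears when already-rounded precision and recall values are plugged back into the F1 formula.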
To use the sklearn interface, encode class A as 1, class B as 2, and class C as 3, so that
True class y_true : 1, 1, 1, 1, 2, 2, 2, 3, 3
Predicted class y_pred : 1, 1, 2, 3, 2, 2, 3, 2, 3
from sklearn.metrics import f1_score
y_true = [1, 1, 1, 1, 2, 2, 2, 3, 3]
y_pred = [1, 1, 2, 3, 2, 2, 3, 2, 3]
f1_micro = f1_score(y_true,y_pred,average='micro')
f1_macro = f1_score(y_true,y_pred,average='macro')
print('f1_micro: {0}'.format(f1_micro))
print('f1_macro: {0}'.format(f1_macro))
Output:
f1_micro: 0.5555555555555556
f1_macro: 0.546031746031746
These values agree with the manual calculation up to rounding.
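Two of the other average options listed earlier can be tried on the same data. The sketch below assumes only the documented behaviour of f1_score: average=None returns one score per class, and 'weighted' averages the per-class scores weighted by the number of true samples in each class.

```python
from sklearn.metrics import f1_score

y_true = [1, 1, 1, 1, 2, 2, 2, 3, 3]
y_pred = [1, 1, 2, 3, 2, 2, 3, 2, 3]

# average=None: an array with one F1-score per class, in sorted label order.
f1_per_class = f1_score(y_true, y_pred, average=None)

# average='weighted': per-class F1-scores weighted by class support (4, 3, 2 here).
f1_weighted = f1_score(y_true, y_pred, average='weighted')

print(f1_per_class)
print(f1_weighted)
```

Because class 1 has the most samples and also the highest F1-score, the weighted value comes out higher than the macro value on this data.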
sklearn can also compute the precision, recall, and F1-score of each class:
# Compute per-class precision, recall, and F1-score
from sklearn.metrics import precision_recall_fscore_support
y_true = [1, 1, 1, 1, 2, 2, 2, 3, 3]
y_pred = [1, 1, 2, 3, 2, 2, 3, 2, 3]
# The fourth return value is the support: the number of true samples in each class
p_class, r_class, f_class, support_class = precision_recall_fscore_support(y_true, y_pred, labels=[1, 2, 3])
print(p_class)
print(r_class)
print(f_class)
print(support_class)
Output:
[1. 0.5 0.33333333]
[0.5 0.66666667 0.5 ]
[0.66666667 0.57142857 0.4 ]
[4 3 2]
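These match the manual calculation.
Note:
precision_recall_fscore_support returns the precision, recall, and F1-score of every class. Within each returned array, the classes appear in the same order as in the labels parameter. For example, with
p_class = [1. 0.5 0.33333333]
labels=[1,2,3]
the result reads as
the precision of class 1 is 1
the precision of class 2 is 0.5
the precision of class 3 is 0.3333333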
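One way to make this ordering explicit is to pair each label with its score. In the sketch below the labels are deliberately passed out of sorted order to show that the returned arrays follow the labels argument; the dictionary name precision_by_label is illustrative.

```python
from sklearn.metrics import precision_recall_fscore_support

y_true = [1, 1, 1, 1, 2, 2, 2, 3, 3]
y_pred = [1, 1, 2, 3, 2, 2, 3, 2, 3]

# Labels given in a non-sorted order on purpose.
labels = [3, 1, 2]
p, r, f, s = precision_recall_fscore_support(y_true, y_pred, labels=labels)

# Pair each label with its precision so the order no longer matters.
precision_by_label = {label: float(value) for label, value in zip(labels, p)}
support_by_label = {label: int(value) for label, value in zip(labels, s)}

print(precision_by_label)
print(support_by_label)
```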