Evaluation Metrics for Multi-Label Classification in Python: How to Evaluate Multi-Class Multi-Label Models (Definitions + NumPy Implementation)

I. Multi-Class Multi-Label Problem Definition

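Multi-class classification is defined in contrast to binary classification. In a binary problem, each label takes only the values 0 or 1, for example whether or not an image shows a dog, or whether or not a patient has a disease. A multi-class problem has more than two categories, for example deciding whether an object is a cat, a dog, a bird, or a rabbit, or which of the diseases A, B, C, or D a patient has. Note that in a multi-class problem, usually exactly one category is correct for each sample.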

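What, then, is multi-label? Put simply, a single sample carries several labels at once. Take a landscape photo that contains sky, a cat, a dog, a bird, and a tree: if all of these categories are among those the task has to recognize, the photo has multiple labels. Multi-label tasks are generally considerably harder, because the model has to make a decision for every category rather than pick a single winner.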
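To make the distinction concrete, here is a minimal sketch of how the two kinds of targets typically look as binary arrays. The class names and the label matrices are illustrative assumptions, not data from this article.

```python
import numpy as np

# Hypothetical class list, used only for illustration.
classes = ["cat", "dog", "bird", "rabbit"]

# Multi-class: exactly one class is correct per sample, so each row is one-hot.
multi_class = np.array([[1, 0, 0, 0],
                        [0, 0, 1, 0]])

# Multi-label: several classes can be correct at once, so rows are multi-hot.
multi_label = np.array([[1, 1, 0, 0],
                        [0, 1, 1, 1]])

# Number of active labels per sample in each setting.
labels_per_sample_mc = multi_class.sum(axis=1)
labels_per_sample_ml = multi_label.sum(axis=1)
print(labels_per_sample_mc, labels_per_sample_ml)
```

All metrics below assume this binary (samples × labels) layout for both the ground truth and the predictions.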

II. Evaluation Methods

Following [1] and [2], evaluation metrics for multi-class multi-label models are usually divided into two groups: example-based metrics and label-based metrics.

Example-based Metrics

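1. Subset accuracy

$$\text{subsetacc}(h)=\frac{1}{p} \sum_{i=1}^{p} \mathbb{1}\left[h\left(x_{i}\right)=Y_{i}\right]$$

where $h$ denotes a multi-label classifier, $h(x_i)$ returns the predicted label set for sample $x_i$, $Y_i$ is the true label set of $x_i$, $\mathbb{1}[\cdot]$ is 1 when the condition holds and 0 otherwise, and $p$ is the number of samples.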

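import numpy as np

epsilon = 1e-8  # small constant that prevents division by zero

# gt holds the ground-truth labels, predict the predicted labels
# example form: gt=[[1,0,0,1]], predict=[[1,0,1,1]]
def example_subset_accuracy(gt, predict):
    # a sample counts as correct only if every label in its row matches
    ex_equal = np.all(np.equal(gt, predict), axis=1).astype("float32")
    return np.mean(ex_equal)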
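A small worked example shows how strict this metric is. The sketch below re-implements the same idea in a standalone helper (the name `subset_accuracy` and the data are assumptions made for illustration) so that it can be run on its own.

```python
import numpy as np

def subset_accuracy(gt, predict):
    """Fraction of samples whose predicted label row equals the true row exactly."""
    gt, predict = np.asarray(gt), np.asarray(predict)
    return float(np.mean(np.all(gt == predict, axis=1)))

# Hypothetical data: 3 samples, 4 labels.
gt      = [[1, 0, 0, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
predict = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]

# Row 0 is wrong in a single label, so it scores 0 even though 3 of 4 labels match.
score = subset_accuracy(gt, predict)
print(score)
```

Two of the three rows match exactly, so the score is 2/3; a prediction that is almost right earns no partial credit.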

2. Example accuracy

def example_accuracy(gt, predict):
    # per sample: |true AND predicted| / |true OR predicted|
    ex_and = np.sum(np.logical_and(gt, predict), axis=1).astype("float32")
    ex_or = np.sum(np.logical_or(gt, predict), axis=1).astype("float32")
    return np.mean(ex_and / (ex_or + epsilon))
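Read as set operations, the function above divides the size of the intersection of the true and predicted label sets by the size of their union for each sample, then averages over samples. The following sketch traces that computation on two hypothetical samples; the helper name `jaccard_per_sample` is an assumption introduced here.

```python
import numpy as np

def jaccard_per_sample(gt, predict, eps=1e-8):
    """Per-sample |intersection| / |union| of the true and predicted label sets."""
    gt = np.asarray(gt, dtype=bool)
    predict = np.asarray(predict, dtype=bool)
    inter = np.sum(gt & predict, axis=1)
    union = np.sum(gt | predict, axis=1)
    return inter / (union + eps)

# Hypothetical data: 2 samples, 4 labels.
gt      = [[1, 0, 0, 1], [0, 1, 1, 0]]
predict = [[1, 0, 1, 1], [0, 1, 0, 0]]

per_sample = jaccard_per_sample(gt, predict)
# sample 0: 2 shared labels out of 3 in the union -> 2/3
# sample 1: 1 shared label out of 2 in the union  -> 1/2
example_acc = per_sample.mean()
print(per_sample, example_acc)
```

Unlike subset accuracy, a partially correct prediction contributes a fractional score here.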

3. Example precision

def example_precision(gt, predict):
    # per sample: correctly predicted labels / all predicted labels
    ex_and = np.sum(np.logical_and(gt, predict), axis=1).astype("float32")
    ex_predict = np.sum(predict, axis=1).astype("float32")
    return np.mean(ex_and / (ex_predict + epsilon))

4. Example recall

def example_recall(gt, predict):
    # per sample: correctly predicted labels / all true labels
    ex_and = np.sum(np.logical_and(gt, predict), axis=1).astype("float32")
    ex_gt = np.sum(gt, axis=1).astype("float32")
    return np.mean(ex_and / (ex_gt + epsilon))
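For reference, the three functions above can be written in set notation as follows. This is a reconstruction from the code (ignoring the epsilon term), where $Y_i$ is the true label set of sample $x_i$, $h(x_i)$ is the predicted label set, and $p$ is the number of samples; it should agree with the definitions in [1], but treat it as a sketch rather than a quotation.

```latex
\begin{aligned}
\text{Accuracy}_{\text{exam}}(h)  &= \frac{1}{p}\sum_{i=1}^{p}\frac{\left|Y_i \cap h(x_i)\right|}{\left|Y_i \cup h(x_i)\right|} \\
\text{Precision}_{\text{exam}}(h) &= \frac{1}{p}\sum_{i=1}^{p}\frac{\left|Y_i \cap h(x_i)\right|}{\left|h(x_i)\right|} \\
\text{Recall}_{\text{exam}}(h)    &= \frac{1}{p}\sum_{i=1}^{p}\frac{\left|Y_i \cap h(x_i)\right|}{\left|Y_i\right|}
\end{aligned}
```

The three metrics share the same numerator and differ only in what the overlap is normalized by: the union, the predicted set, or the true set.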

5. Example F1

$\beta$ measures the relative importance of recall with respect to precision: $\beta = 1$ reduces to the standard F1, $\beta > 1$ gives recall more influence, and $\beta < 1$ gives precision more influence.

$$F_{\text{exam}}^{\beta}(h)=\frac{\left(1+\beta^{2}\right) \cdot \text{Precision}_{\text{exam}}(h) \cdot \text{Recall}_{\text{exam}}(h)}{\beta^{2} \cdot \text{Precision}_{\text{exam}}(h)+\text{Recall}_{\text{exam}}(h)}$$

def example_f1(gt, predict, beta=1):
    p = example_precision(gt, predict)
    r = example_recall(gt, predict)
    # beta**2 weights only the precision term in the denominator
    return ((1 + beta**2) * p * r) / ((beta**2) * p + r + epsilon)
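The effect of $\beta$ is easiest to see numerically. The sketch below applies the $F^{\beta}$ formula to a fixed, hypothetical pair of precision and recall values; the helper `f_beta` and the numbers are assumptions chosen for illustration.

```python
def f_beta(precision, recall, beta=1.0, eps=1e-8):
    """F-beta computed from an already-averaged precision and recall."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall + eps)

# Hypothetical scores: high precision, low recall.
p, r = 0.9, 0.5

f1 = f_beta(p, r, beta=1)     # harmonic mean of p and r
f2 = f_beta(p, r, beta=2)     # recall weighs more, so the score moves toward r
f05 = f_beta(p, r, beta=0.5)  # precision weighs more, so the score moves toward p
print(f1, f2, f05)
```

With recall as the weaker of the two scores, raising $\beta$ lowers the result and lowering $\beta$ raises it, which matches the description above.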

Label-based Metrics

Before computing the label-based metrics, we first need the basic statistics they are built from: for the $j$-th label, the numbers of true positives, false positives, true negatives, and false negatives.

$$\begin{array}{l}
TP_{j}=\left|\left\{x_{i} \mid y_{j} \in Y_{i} \wedge y_{j} \in h\left(x_{i}\right), 1 \leq i \leq p\right\}\right| \\
FP_{j}=\left|\left\{x_{i} \mid y_{j} \notin Y_{i} \wedge y_{j} \in h\left(x_{i}\right), 1 \leq i \leq p\right\}\right| \\
TN_{j}=\left|\left\{x_{i} \mid y_{j} \notin Y_{i} \wedge y_{j} \notin h\left(x_{i}\right), 1 \leq i \leq p\right\}\right| \\
FN_{j}=\left|\left\{x_{i} \mid y_{j} \in Y_{i} \wedge y_{j} \notin h\left(x_{i}\right), 1 \leq i \leq p\right\}\right|
\end{array}$$

where $p$ is the number of samples and $y_j$ denotes the $j$-th class label. The four quantities describe the binary classification performance on each class and satisfy $TP_j + FP_j + TN_j + FN_j = p$.

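def _label_quantity(gt, predict):
    # per-label counts over all samples; result has shape (4, num_labels)
    # with rows ordered [tp, fp, tn, fn]
    not_gt, not_predict = np.logical_not(gt), np.logical_not(predict)
    tp = np.sum(np.logical_and(gt, predict), axis=0)
    fp = np.sum(np.logical_and(not_gt, predict), axis=0)
    tn = np.sum(np.logical_and(not_gt, not_predict), axis=0)
    fn = np.sum(np.logical_and(gt, not_predict), axis=0)
    return np.stack([tp, fp, tn, fn], axis=0).astype("float")

Computing Accuracy, Precision, Recall, and F1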

$$\begin{array}{c}
\text{Accuracy}\left(TP_{j}, FP_{j}, TN_{j}, FN_{j}\right)=\frac{TP_{j}+TN_{j}}{TP_{j}+FP_{j}+TN_{j}+FN_{j}} \\
\text{Precision}\left(TP_{j}, FP_{j}, TN_{j}, FN_{j}\right)=\frac{TP_{j}}{TP_{j}+FP_{j}} \\
\text{Recall}\left(TP_{j}, FP_{j}, TN_{j}, FN_{j}\right)=\frac{TP_{j}}{TP_{j}+FN_{j}} \\
F^{\beta}\left(TP_{j}, FP_{j}, TN_{j}, FN_{j}\right)=\frac{\left(1+\beta^{2}\right) \cdot TP_{j}}{\left(1+\beta^{2}\right) \cdot TP_{j}+\beta^{2} \cdot FN_{j}+FP_{j}}
\end{array}$$
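As a quick check of these definitions, the sketch below counts the four quantities column by column on a small hypothetical label matrix and derives per-label precision and recall from them. The data is made up for illustration, and no epsilon is needed because no denominator is zero here.

```python
import numpy as np

# Hypothetical data: 4 samples (rows), 3 labels (columns).
gt = np.array([[1, 0, 1],
               [1, 1, 0],
               [0, 0, 1],
               [1, 0, 0]], dtype=bool)
predict = np.array([[1, 0, 0],
                    [1, 1, 1],
                    [0, 0, 1],
                    [0, 0, 0]], dtype=bool)

# Summing over axis 0 counts samples, giving one value per label.
tp = np.sum(gt & predict, axis=0)
fp = np.sum(~gt & predict, axis=0)
tn = np.sum(~gt & ~predict, axis=0)
fn = np.sum(gt & ~predict, axis=0)

precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, fp, tn, fn)
print(precision, recall)
```

For every label the four counts add up to the number of samples (4 here), as the identity $TP_j + FP_j + TN_j + FN_j = p$ requires.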

Macro-averaging and Micro-averaging

$$\begin{aligned}
B_{\text{macro}}(h) &= \frac{1}{q} \sum_{j=1}^{q} B\left(TP_{j}, FP_{j}, TN_{j}, FN_{j}\right) \\
B_{\text{micro}}(h) &= B\left(\sum_{j=1}^{q} TP_{j}, \sum_{j=1}^{q} FP_{j}, \sum_{j=1}^{q} TN_{j}, \sum_{j=1}^{q} FN_{j}\right)
\end{aligned}$$

where $B \in \{\text{Accuracy}, \text{Precision}, \text{Recall}, F^{\beta}\}$ denotes one of the metrics above and $q$ is the total number of classes. Macro-averaging computes the metric for each class and then averages over classes, so every class carries equal weight. Micro-averaging first sums the counts over all classes and then computes the metric once, which [1] describes as giving equal weight to every sample.

1. Label accuracy

Macro

def label_accuracy_macro(gt, predict):
    quantity = _label_quantity(gt, predict)
    tp_tn = np.add(quantity[0], quantity[2])
    tp_fp_tn_fn = np.sum(quantity, axis=0)
    # accuracy per label, then averaged over labels
    return np.mean(tp_tn / (tp_fp_tn_fn + epsilon))

Micro

def label_accuracy_micro(gt, predict):
    quantity = _label_quantity(gt, predict)
    # pool the counts over all labels before computing the metric
    sum_tp, sum_fp, sum_tn, sum_fn = np.sum(quantity, axis=1)
    return (sum_tp + sum_tn) / (sum_tp + sum_fp + sum_tn + sum_fn + epsilon)
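One property worth checking: the accuracy denominator $TP_j + FP_j + TN_j + FN_j$ equals the number of samples for every label, so macro and micro label accuracy should coincide (up to the epsilon term). The sketch below verifies this on random data; the seed, array shapes, and variable names are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
gt = rng.integers(0, 2, size=(50, 6)).astype(bool)       # 50 samples, 6 labels
predict = rng.integers(0, 2, size=(50, 6)).astype(bool)

# A cell is correct when it is either a true positive or a true negative.
correct = (gt == predict)

macro_acc = correct.mean(axis=0).mean()     # accuracy per label, then average
micro_acc = correct.sum() / correct.size    # pool all cells, then divide
print(macro_acc, micro_acc)
```

The two averages differ only for metrics such as precision, recall, and F1, whose denominators vary from label to label.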

2. Label precision

Macro

def label_precision_macro(gt, predict):
    quantity = _label_quantity(gt, predict)
    tp = quantity[0]
    tp_fp = np.add(quantity[0], quantity[1])
    # precision per label, then averaged over labels
    return np.mean(tp / (tp_fp + epsilon))

Micro

def label_precision_micro(gt, predict):
    quantity = _label_quantity(gt, predict)
    # pool the counts over all labels before computing the metric
    sum_tp, sum_fp, sum_tn, sum_fn = np.sum(quantity, axis=1)
    return sum_tp / (sum_tp + sum_fp + epsilon)
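Macro and micro precision can diverge sharply when label frequencies are imbalanced. The following sketch uses hypothetical per-label counts, one frequent label that is predicted well and one rare label that is predicted poorly, to show the gap.

```python
import numpy as np

# Hypothetical per-label counts: label 0 is frequent, label 1 is rare.
tp = np.array([90.0, 1.0])
fp = np.array([10.0, 4.0])

# Macro: each label's precision counts equally -> (0.9 + 0.2) / 2
macro_precision = np.mean(tp / (tp + fp))

# Micro: counts are pooled first, so the frequent label dominates -> 91 / 105
micro_precision = tp.sum() / (tp.sum() + fp.sum())
print(macro_precision, micro_precision)
```

Here the macro average (0.55) exposes the weak rare label, while the micro average (about 0.87) mostly reflects the frequent one. Which of the two is more informative depends on whether rare labels matter for the task.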

3. Label recall

Macro

def label_recall_macro(gt, predict):
    quantity = _label_quantity(gt, predict)
    tp = quantity[0]
    tp_fn = np.add(quantity[0], quantity[3])
    # recall per label, then averaged over labels
    return np.mean(tp / (tp_fn + epsilon))

Micro

def label_recall_micro(gt, predict):
    quantity = _label_quantity(gt, predict)
    # pool the counts over all labels before computing the metric
    sum_tp, sum_fp, sum_tn, sum_fn = np.sum(quantity, axis=1)
    return sum_tp / (sum_tp + sum_fn + epsilon)

4. Label F1

Macro

def label_f1_macro(gt, predict, beta=1):
    quantity = _label_quantity(gt, predict)
    tp = quantity[0]
    fp = quantity[1]
    fn = quantity[3]
    # F-beta per label, then averaged over labels
    return np.mean((1 + beta**2) * tp / ((1 + beta**2) * tp + beta**2 * fn + fp + epsilon))

Micro

def label_f1_micro(gt, predict, beta=1):
    quantity = _label_quantity(gt, predict)
    # pool the counts over all labels before computing F-beta
    tp = np.sum(quantity[0])
    fp = np.sum(quantity[1])
    fn = np.sum(quantity[3])
    return (1 + beta**2) * tp / ((1 + beta**2) * tp + beta**2 * fn + fp + epsilon)
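The count-based $F^{\beta}$ used in these two functions is algebraically the same as the precision/recall form of $F^{\beta}$ evaluated on the same counts. The sketch below checks that numerically for a few values of $\beta$; the counts and helper names are hypothetical.

```python
def f_beta_from_counts(tp, fp, fn, beta=1.0):
    """F-beta written directly in terms of TP, FP, and FN."""
    b2 = beta ** 2
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)

def f_beta_from_pr(tp, fp, fn, beta=1.0):
    """F-beta written in terms of precision and recall from the same counts."""
    b2 = beta ** 2
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical pooled counts.
tp, fp, fn = 91.0, 14.0, 20.0

# Largest disagreement between the two forms over a few beta values.
max_diff = max(
    abs(f_beta_from_counts(tp, fp, fn, b) - f_beta_from_pr(tp, fp, fn, b))
    for b in (0.5, 1.0, 2.0)
)
print(max_diff)
```

The equivalence holds for a single set of counts, so it applies to the micro average and to each label individually. The macro version averages per-label F scores, which in general is not the same as plugging macro precision and macro recall into the formula.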

Note: epsilon is set to a small constant such as 1e-8 to prevent division by zero.

Reference

[1] M. Zhang and Z. Zhou, "A Review on Multi-Label Learning Algorithms," in IEEE Transactions on Knowledge and Data Engineering, vol. 26, no. 8, pp. 1819-1837, Aug. 2014, doi: 10.1109/TKDE.2013.39.

[2] W. Long, Y. Yang, and H.-B. Shen, "ImPLoc: a multi-instance deep learning model for the prediction of protein subcellular localization based on immunohistochemistry images," in Bioinformatics, vol. 36, no. 7, pp. 2244–2250, Apr. 2020, doi: 10.1093/bioinformatics/btz909.
