在目标检测领域,衡量一个模型的优劣的指标往往是mAP,然而实际工程中,有时候更倾向于看漏检率和虚检率。YOLOv5的原始代码并没有这两个指标的输出,因此我想利用原始代码的混淆矩阵,输出这两个指标数值。
漏检即原本有目标存在却没有检测出来,换句话说就是原本是目标却检测成了背景。
虚检(虚警)即原本没有目标却误认为有目标,换句话说就是原本是背景却检测成了目标。
首先来看YOLOv5原本输出的混淆矩阵,图中灰色覆盖的地方是原本输出的各类别,也就是输出的正例,最后一行和一列是背景类。
列是模型预测的结果,行是标签的真实结果。可以看到最后一行出现数值,表示出现了漏检;最后一列出现数值,则表示出现了虚检。
现在来看YOLOv5输出的混淆矩阵代码部分,代码主要位于metrics.py
的ConfusionMatrix
类中。
class ConfusionMatrix:
# Updated version of https://github.com/kaanakan/object_detection_confusion_matrix
def __init__(self, nc, conf=0.25, iou_thres=0.45):
"""
params nc: 数据集类别个数
params conf: 预测框置信度阈值
Params iou_thres: iou阈值
"""
self.matrix = np.zeros((nc + 1, nc + 1)) # +1的目的是添加背景类
self.nc = nc # number of classes
self.conf = conf
self.iou_thres = iou_thres
self.lou = 0
self.total = 0
self.xu = 0
def process_batch(self, detections, labels):
"""
Return intersection-over-union (Jaccard index) of boxes.
Both sets of boxes are expected to be in (x1, y1, x2, y2) format.
Arguments:
detections (Array[N, 6]), x1, y1, x2, y2, conf, class
labels (Array[M, 5]), class, x1, y1, x2, y2
Returns:
None, updates confusion matrix accordingly
"""
detections = detections[detections[:, 4] > self.conf] # 筛除置信度过低的预测框(和nms差不多)
gt_classes = labels[:, 0].int()
detection_classes = detections[:, 5].int()
iou = general.box_iou(labels[:, 1:], detections[:, :4])
x = torch.where(iou > self.iou_thres)
if x[0].shape[0]:
matches = torch.cat((torch.stack(x, 1), iou[x[0], x[1]][:, None]), 1).cpu().numpy()
if x[0].shape[0] > 1:
matches = matches[matches[:, 2].argsort()[::-1]]
matches = matches[np.unique(matches[:, 1], return_index=True)[1]]
matches = matches[matches[:, 2].argsort()[::-1]]
matches = matches[np.unique(matches[:, 0], return_index=True)[1]]
else:
matches = np.zeros((0, 3))
n = matches.shape[0] > 0
m0, m1, _ = matches.transpose().astype(np.int16)
for i, gc in enumerate(gt_classes):
j = m0 == i
if n and sum(j) == 1:
# 如果sum(j)=1 说明gt[i]这个真实框被某个预测框检测到了
self.matrix[gc, detection_classes[m1[j]]] += 1 # correct
else:
# 如果sum(j)=0 说明gt[i]这个真实框没用被任何预测框检测到 也就是说这个真实框被检测成了背景框
self.matrix[self.nc, gc] += 1 # background FP
if n:
for i, dc in enumerate(detection_classes):
if not any(m1 == i):
self.matrix[dc, self.nc] += 1 # background FN
self.lou = sum(self.matrix[-1, :])
self.total = sum(sum(self.matrix))
self.xu = sum(self.matrix[:, -1])
def matrix(self):
return self.matrix
def plot(self, save_dir='', names=()):
try:
import seaborn as sn
# 按照每一列进行归一化
array = self.matrix / (self.matrix.sum(0).reshape(1, self.nc + 1) + 1E-6) # normalize
array[array < 0.005] = np.nan # don't annotate (would appear as 0.00)
fig = plt.figure(figsize=(12, 9), tight_layout=True)
sn.set(font_scale=1.0 if self.nc < 50 else 0.8) # for label size
labels = (0 < len(names) < 99) and len(names) == self.nc # apply names to ticklabels
sn.heatmap(array, annot=self.nc < 30, annot_kws={"size": 8}, cmap='Blues', fmt='.2f', square=True,
xticklabels=names + ['background FP'] if labels else "auto",
yticklabels=names + ['background FN'] if labels else "auto").set_facecolor((1, 1, 1))
fig.axes[0].set_xlabel('True')
fig.axes[0].set_ylabel('Predicted')
fig.savefig(Path(save_dir) / 'confusion_matrix.png', dpi=250)
except Exception as e:
pass
def print(self):
for i in range(self.nc + 1):
print(' '.join(map(str, self.matrix[i])))
阅读代码可以发现,混淆矩阵再绘制时对每一列单独进行了归一化,那么再绘制之前,混淆矩阵存储了每一个预测结果和真实结果的数目。
于是我添加了三个属性self.lou
、self.total = 0
、self.xu = 0
,分别统计漏检目标数目,总目标数目和虚检目标数目。
漏检目标数目只需要将混淆矩阵最后一行相加,虚检目标数目只需要将混淆矩阵最后一列相加,总目标数目则将混淆矩阵所有数量相加。
然后在test.py
中进行添加:
# Print speeds
t = tuple(x / seen * 1E3 for x in (t0, t1, t0 + t1)) + (imgsz, imgsz, batch_size) # tuple
if not training:
print('Speed: %.1f/%.1f/%.1f ms inference/NMS/total per %gx%g image at batch-size %g' % t)
# 计算漏检率
print("漏检样本数为:")
print(int(confusion_matrix.lou))
print("漏检率为:")
print(confusion_matrix.lou / confusion_matrix.total)
# 计算虚检率
print("虚检样本数为:")
print(int(confusion_matrix.xu))
print("虚检率为:")
print(confusion_matrix.xu / confusion_matrix.total)
# Plots
if plots:
confusion_matrix.plot(save_dir=save_dir, names=list(names.values()))
if wandb_logger and wandb_logger.wandb:
val_batches = [wandb_logger.wandb.Image(str(f), caption=f.name) for f in sorted(save_dir.glob('test*.jpg'))]
wandb_logger.log({"Validation": val_batches})
if wandb_images:
wandb_logger.log({"Bounding Box Debugger/Images": wandb_images})
输出效果:
2022.8.8更
突然想到前面的代码有个Bug,计算漏检率不应该采用混淆矩阵的全部内容,而只需采用混淆矩阵中的正例样本数目,否则分母将虚检的目标也混合进去,导致结果偏小。
metrics.py
修改:
class ConfusionMatrix:
# Updated version of https://github.com/kaanakan/object_detection_confusion_matrix
def __init__(self, nc, conf=0.25, iou_thres=0.45):
"""
params nc: 数据集类别个数
params conf: 预测框置信度阈值
Params iou_thres: iou阈值
"""
self.matrix = np.zeros((nc + 1, nc + 1)) # +1的目的是添加背景类
self.nc = nc # number of classes
self.conf = conf
self.iou_thres = iou_thres
self.lou = 0
self.total = 0
self.xu = 0
self.class_total = 0
def process_batch(self, detections, labels):
"""
Return intersection-over-union (Jaccard index) of boxes.
Both sets of boxes are expected to be in (x1, y1, x2, y2) format.
Arguments:
detections (Array[N, 6]), x1, y1, x2, y2, conf, class
labels (Array[M, 5]), class, x1, y1, x2, y2
Returns:
None, updates confusion matrix accordingly
"""
detections = detections[detections[:, 4] > self.conf] # 筛除置信度过低的预测框(和nms差不多)
gt_classes = labels[:, 0].int()
detection_classes = detections[:, 5].int()
iou = general.box_iou(labels[:, 1:], detections[:, :4])
x = torch.where(iou > self.iou_thres)
if x[0].shape[0]:
matches = torch.cat((torch.stack(x, 1), iou[x[0], x[1]][:, None]), 1).cpu().numpy()
if x[0].shape[0] > 1:
matches = matches[matches[:, 2].argsort()[::-1]]
matches = matches[np.unique(matches[:, 1], return_index=True)[1]]
matches = matches[matches[:, 2].argsort()[::-1]]
matches = matches[np.unique(matches[:, 0], return_index=True)[1]]
else:
matches = np.zeros((0, 3))
n = matches.shape[0] > 0
m0, m1, _ = matches.transpose().astype(np.int16)
for i, gc in enumerate(gt_classes):
j = m0 == i
if n and sum(j) == 1:
# 如果sum(j)=1 说明gt[i]这个真实框被某个预测框检测到了
self.matrix[gc, detection_classes[m1[j]]] += 1 # correct
else:
# 如果sum(j)=0 说明gt[i]这个真实框没用被任何预测框检测到 也就是说这个真实框被检测成了背景框
self.matrix[self.nc, gc] += 1 # background FP
if n:
for i, dc in enumerate(detection_classes):
if not any(m1 == i):
self.matrix[dc, self.nc] += 1 # background FN
self.lou = sum(self.matrix[-1, :])
self.total = sum(sum(self.matrix))
self.xu = sum(self.matrix[:, -1])
self.class_total = sum(sum(self.matrix)[: -1])
def matrix(self):
return self.matrix
def plot(self, save_dir='', names=()):
try:
import seaborn as sn
# 按照每一列进行归一化
array = self.matrix / (self.matrix.sum(0).reshape(1, self.nc + 1) + 1E-6) # normalize
array[array < 0.005] = np.nan # don't annotate (would appear as 0.00)
fig = plt.figure(figsize=(12, 9), tight_layout=True)
sn.set(font='SimHei', font_scale=1.0 if self.nc < 50 else 0.8) # for label size
labels = (0 < len(names) < 99) and len(names) == self.nc # apply names to ticklabels
sn.heatmap(array, annot=self.nc < 30, annot_kws={"size": 8}, cmap='Blues', fmt='.2f', square=True,
xticklabels=names + ['background FP'] if labels else "auto",
yticklabels=names + ['background FN'] if labels else "auto").set_facecolor((1, 1, 1))
fig.axes[0].set_xlabel('True')
fig.axes[0].set_ylabel('Predicted')
fig.savefig(Path(save_dir) / 'confusion_matrix.png', dpi=250)
except Exception as e:
pass
def print(self):
for i in range(self.nc + 1):
print(' '.join(map(str, self.matrix[i])))
test.py
修改:
# 计算漏检率
print("漏检样本数为:")
print(int(confusion_matrix.lou))
print("漏检率为:")
print(confusion_matrix.lou / confusion_matrix.class_total)
# 计算虚检率
print("虚检样本数为:")
print(int(confusion_matrix.xu))
print("虚检率为:")
print(confusion_matrix.xu / confusion_matrix.total)
2022.8.10更
突然想到一个问题:其实YOLOv5本身输出指标包含了准确率§和召回率®。
在一些博文中提到:漏检率=1-召回率,在YOLOv5中也可以这样理解吗?
回顾一下召回率的计算公式:R = TP / (TP+FN),通俗的说,召回率就是来衡量真实样本中,被检测正确的比例。
这里的TP(true positive)表示预测出的正确的框,即通过模型预测出的框,逐个与该图像的标注框求iou,如果与标注框产生的最大iou大于之前设置好的iou阈值,并且此预测框对应的标签与通过iou操作找到的标注框标签一致。
换句话说,YOLOv5的召回率分子的TP只有斜对角线上的值,检测出来但分类错误依然被视作FN,因此,自己计算的漏检率和召回率并不是严格互补的关系。
进一步思考,有没有办法让模型训练以降低漏检率为目标呢,也就是以召回率最高来保存模型。
先来看看YOLOv5模型的保存逻辑:
在train.py
中,定义了一个fi指标:
# Update best mAP
fi = fitness(np.array(results).reshape(1, -1)) # weighted combination of [P, R, [email protected], [email protected]]
if fi > best_fitness:
best_fitness = fi
wandb_logger.end_epoch(best_result=best_fitness == fi)
这个指标在metrics.py
中进行定义得到:
def fitness(x):
# Model fitness as a weighted combination of metrics
w = [0.0, 0.0, 0.1, 0.9] # weights for [P, R, [email protected], [email protected]:0.95]
return (x[:, :4] * w).sum(1)
也就是说,YOLOv5的模型保存逻辑实际上是0.1的[email protected] x [email protected]:0.95,这里定义了四个权重,以R为目标只需修改对应权重即可。