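This article gives an overview of the main IOU variants. It is mostly code, and the code follows the formulas step by step so that it is easy to follow. While reading you may notice a lot of duplicated code across the different IOU implementations; YOLOv5 provides a concise, unified implementation of all of them in its bbox_iou function.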
The main IOU variants are IOU, GIOU, DIOU, and CIOU.
IOU (Intersection over union)
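Intersection over union is the plain IOU. It measures how much two boxes overlap: the area of their intersection divided by the area of their union. The formula is: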
$$IOU(A, B)=\frac{A \cap B}{A \cup B}$$
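Starting simple, we first compute the IOU between two boxes, then the IOU between one GT box and several predicted boxes.
First, the IOU between two boxes (one predicted box Pre and one ground-truth box GT) is computed as follows: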
def get_IoU_1pre_1gt(pred_bbox, gt_bbox):
    """
    return the iou score
    between 1 pred bbox (list)[x1, y1, x2, y2] and 1 gt bbox (list)[x1, y1, x2, y2]
    """
    import numpy as np
    pred_bbox, gt_bbox = np.array(pred_bbox), np.array(gt_bbox)
    # -----0---- get [x1, y1, x2, y2] and [w, h] of inters
    ixmin = max(pred_bbox[0], gt_bbox[0])  # x1 of the inter
    iymin = max(pred_bbox[1], gt_bbox[1])  # y1 of the inter
    ixmax = min(pred_bbox[2], gt_bbox[2])  # x2 of the inter
    iymax = min(pred_bbox[3], gt_bbox[3])  # y2 of the inter
    # if pred_bbox lies outside gt_bbox, ixmax - ixmin + 1. is negative and is clamped to 0
    iw = np.maximum(ixmax - ixmin + 1., 0.)
    ih = np.maximum(iymax - iymin + 1., 0.)
    # -----1----- area of intersection
    area_inter = iw * ih
    # -----2----- area_uni = S1 + S2 - area_inter
    area_uni = ((pred_bbox[2] - pred_bbox[0] + 1.) * (pred_bbox[3] - pred_bbox[1] + 1.) +
                (gt_bbox[2] - gt_bbox[0] + 1.) * (gt_bbox[3] - gt_bbox[1] + 1.) -
                area_inter)
    # -----3----- iou
    IOU = area_inter / area_uni
    return IOU
Let's run an example:
pred_bbox = [45, 55, 110, 135]
gt_bbox = [50, 60, 110, 130]
print(get_IoU_1pre_1gt(pred_bbox, gt_bbox))
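To make the result easier to check, the arithmetic for this pair of boxes can be written out by hand. The + 1. terms in the function suggest that coordinates are treated as inclusive pixel indices, so a box spanning x1..x2 is x2 - x1 + 1 pixels wide; that reading is an assumption, since the post does not state it.

```python
# Worked arithmetic for pred_bbox = [45, 55, 110, 135] and gt_bbox = [50, 60, 110, 130].
# Assumption: inclusive pixel coordinates, hence the "+ 1" in every width and height.
inter_w = 110 - 50 + 1                        # 61
inter_h = 130 - 60 + 1                        # 71
area_inter = inter_w * inter_h                # 4331

area_pred = (110 - 45 + 1) * (135 - 55 + 1)   # 66 * 81 = 5346
area_gt = (110 - 50 + 1) * (130 - 60 + 1)     # 61 * 71 = 4331
area_uni = area_pred + area_gt - area_inter   # 5346

iou = area_inter / area_uni
print(round(iou, 4))  # 0.8101
```

Here the GT box lies entirely inside the predicted box, so the intersection equals the GT area and the union equals the predicted area.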
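The code below computes the IOUs between n (n >= 1) predicted boxes and one ground-truth box, and also returns the maximum IOU and the index of the predicted box that attains it: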
def get_IoUs_npre_1gt(pred_bboxes, gt_bbox):
    '''
    return (the IOUs, the max IOU, index of the max IOU)
    between n pred bboxes (list)[x1, y1, x2, y2] and 1 gt bbox (list)[x1, y1, x2, y2]
    '''
    import numpy as np
    pred_bboxes, gt_bbox = np.array(pred_bboxes), np.array(gt_bbox)
    if pred_bboxes.shape[0] > 0:
        # -----0---- get [inter1[x1, y1, x2, y2], inter2, ...] and [wh1[w, h], wh2, ...] of inters
        ixmin = np.maximum(pred_bboxes[:, 0], gt_bbox[0])  # x1 of the inters
        iymin = np.maximum(pred_bboxes[:, 1], gt_bbox[1])  # y1 of the inters
        ixmax = np.minimum(pred_bboxes[:, 2], gt_bbox[2])  # x2 of the inters
        iymax = np.minimum(pred_bboxes[:, 3], gt_bbox[3])  # y2 of the inters
        iw = np.maximum(ixmax - ixmin + 1., 0.)  # w of the inters
        ih = np.maximum(iymax - iymin + 1., 0.)  # h of the inters
        # -----1----- intersection
        area_inter = iw * ih
        # -----2----- area_uni = S1 + S2 - area_inter
        area_uni = ((gt_bbox[2] - gt_bbox[0] + 1.) * (gt_bbox[3] - gt_bbox[1] + 1.) +
                    (pred_bboxes[:, 2] - pred_bboxes[:, 0] + 1.) * (pred_bboxes[:, 3] - pred_bboxes[:, 1] + 1.) -
                    area_inter)
        # -----3----- iou, get max score and max iou index
        IOUs = area_inter / area_uni
        max_IOU = np.max(IOUs)
        index_max_IOU = np.argmax(IOUs)
        return list(IOUs), max_IOU, index_max_IOU
Let's run an example:
pred_bboxes = [[0, 0, 45, 55],  # outside
               [45, 55, 110, 135],  # across
               [55, 65, 105, 125]]  # inside
gt_bbox = [50, 60, 110, 130]
print(get_IoUs_npre_1gt(pred_bboxes, gt_bbox))
This returns the three IOUs, the maximum IOU, and the index of the maximum IOU.
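The step from one predicted box to n predicted boxes relies on NumPy broadcasting: a column such as pred_bboxes[:, 0] is compared element-wise against the single GT coordinate. The snippet below is a small illustration of that step for the x axis only, using the same example boxes.

```python
import numpy as np

pred_bboxes = np.array([[0, 0, 45, 55],       # outside
                        [45, 55, 110, 135],   # across
                        [55, 65, 105, 125]])  # inside
gt_bbox = np.array([50, 60, 110, 130])

# pred_bboxes[:, 0] has shape (3,) and gt_bbox[0] is a scalar,
# so np.maximum / np.minimum compare the scalar with every element.
ixmin = np.maximum(pred_bboxes[:, 0], gt_bbox[0])
ixmax = np.minimum(pred_bboxes[:, 2], gt_bbox[2])

# A negative extent means no overlap along x and is clamped to 0.
iw = np.maximum(ixmax - ixmin + 1., 0.)

print(ixmin.tolist())  # [50, 50, 55]
print(ixmax.tolist())  # [45, 110, 105]
print(iw.tolist())     # [0.0, 61.0, 51.0]
```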
GIOU (Generalized IOU)
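GIOU is a small refinement of IOU: besides the overlap, it also penalizes the empty area inside the smallest box that encloses both boxes.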
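The post does not write the GIOU formula out. The version below is read off the comment in step 6 of the GIOU code, so treat it as a restatement of that comment; the symbol C for the smallest enclosing box and the |·| notation for area are introduced here for illustration.

```latex
GIOU(A,B) = IOU(A,B) - \frac{|C| - |A \cup B|}{|C|},
\qquad C = \text{smallest axis-aligned box enclosing } A \text{ and } B
```

When A and B do not overlap, IOU is 0 and only the second term remains, so GIOU becomes more negative as the boxes move apart.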
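Building on the IOU code, the code below computes the GIOUs between n (n >= 1) predicted boxes and one ground-truth box, and also returns the maximum GIOU and the index of the predicted box that attains it: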
def get_GIoUs_npre_1gt(pred_bboxes, gt_bbox):
    '''
    return (the GIOUs, the max GIOU, index of the max GIOU)
    between n pred bboxes (list)[x1, y1, x2, y2] and 1 gt bbox (list)[x1, y1, x2, y2]
    '''
    import numpy as np
    pred_bboxes, gt_bbox = np.array(pred_bboxes), np.array(gt_bbox)
    if pred_bboxes.shape[0] > 0:
        # -----0---- get [inter1[x1, y1, x2, y2], inter2, ...] and [wh1[w, h], wh2, ...] of inters
        ixmin = np.maximum(pred_bboxes[:, 0], gt_bbox[0])  # x1 of the inters
        iymin = np.maximum(pred_bboxes[:, 1], gt_bbox[1])  # y1 of the inters
        ixmax = np.minimum(pred_bboxes[:, 2], gt_bbox[2])  # x2 of the inters
        iymax = np.minimum(pred_bboxes[:, 3], gt_bbox[3])  # y2 of the inters
        iw = np.maximum(ixmax - ixmin + 1., 0.)  # w of the inters
        ih = np.maximum(iymax - iymin + 1., 0.)  # h of the inters
        # -----1----- intersection
        area_inter = iw * ih
        # -----2----- area_uni = S1 + S2 - area_inter
        area_uni = ((gt_bbox[2] - gt_bbox[0] + 1.) * (gt_bbox[3] - gt_bbox[1] + 1.) +
                    (pred_bboxes[:, 2] - pred_bboxes[:, 0] + 1.) * (pred_bboxes[:, 3] - pred_bboxes[:, 1] + 1.) -
                    area_inter)
        # -----3----- iou
        IOUs = area_inter / area_uni
        # -----4----- get the smallest enclosing convex box Cs -> [C1[x1,y1,x2,y2], C2, ...]
        cx_min = np.minimum(pred_bboxes[:, 0], gt_bbox[0])
        cy_min = np.minimum(pred_bboxes[:, 1], gt_bbox[1])
        cx_max = np.maximum(pred_bboxes[:, 2], gt_bbox[2])
        cy_max = np.maximum(pred_bboxes[:, 3], gt_bbox[3])
        cw = cx_max - cx_min + 1.
        ch = cy_max - cy_min + 1.
        # -----5----- area of C
        area_c = cw * ch
        # -----6----- GIOU = IOU - (area of C - area of A∪B) / area of C
        GIOUs = IOUs - (area_c - area_uni) / area_c
        max_GIOU = np.max(GIOUs)
        index_max_GIOU = np.argmax(GIOUs)
        return list(GIOUs), max_GIOU, index_max_GIOU
Let's run an example:
pred_bboxes = [[0, 0, 45, 55],  # outside
               [45, 55, 110, 135],  # across
               [55, 65, 105, 125]]  # inside
gt_bbox = [50, 60, 110, 130]
print(get_GIoUs_npre_1gt(pred_bboxes, gt_bbox))
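One way to see what GIOU adds over IOU is to compare two predictions that both miss the GT box. The sketch below uses a compact helper, iou_and_giou, and two made-up boxes, near and far; both names and boxes are hypothetical and only serve this illustration. It follows the same + 1. convention as the code above.

```python
def iou_and_giou(a, b):
    """Return (IOU, GIOU) for two [x1, y1, x2, y2] boxes, with the + 1. pixel convention."""
    iw = max(min(a[2], b[2]) - max(a[0], b[0]) + 1., 0.)
    ih = max(min(a[3], b[3]) - max(a[1], b[1]) + 1., 0.)
    inter = iw * ih
    union = ((a[2] - a[0] + 1.) * (a[3] - a[1] + 1.) +
             (b[2] - b[0] + 1.) * (b[3] - b[1] + 1.) - inter)
    # smallest box enclosing both a and b
    cw = max(a[2], b[2]) - min(a[0], b[0]) + 1.
    ch = max(a[3], b[3]) - min(a[1], b[1]) + 1.
    area_c = cw * ch
    iou = inter / union
    return iou, iou - (area_c - union) / area_c

gt = [50, 60, 110, 130]
near = [120, 60, 180, 130]  # hypothetical box just to the right of gt, no overlap
far = [300, 60, 360, 130]   # hypothetical box much further to the right, no overlap

iou_near, giou_near = iou_and_giou(near, gt)
iou_far, giou_far = iou_and_giou(far, gt)

print(iou_near, iou_far)                       # 0.0 0.0
print(round(giou_near, 4), round(giou_far, 4))  # -0.0687 -0.6077
```

Both predictions have an IOU of 0, so IOU alone cannot tell them apart, while GIOU ranks the nearer box higher.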
DIOU (Distance IOU)
DIOU takes values in the range [-1, 1]. It accounts for the distance between the two boxes, their overlap, and their scale. The formula is:
$$DIOU(A,B)=IOU(A,B)-\frac{p^2(A,B)}{C_p^2}$$
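Here $p^2(A,B)$ is the squared Euclidean distance between the center of box A and the center of box B, and $C_p$ is the diagonal length of the smallest rectangle that encloses both A and B.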
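The effect of the distance term is easiest to see with two predictions of the same size that both lie inside the GT box: their IOU is identical, and only the center distance differs. The sketch below uses a compact helper, iou_and_diou, and two made-up boxes, centered and corner; these names and boxes are hypothetical and chosen only for this illustration.

```python
def iou_and_diou(a, b):
    """Return (IOU, DIOU) for two [x1, y1, x2, y2] boxes, with the + 1. pixel convention."""
    iw = max(min(a[2], b[2]) - max(a[0], b[0]) + 1., 0.)
    ih = max(min(a[3], b[3]) - max(a[1], b[1]) + 1., 0.)
    inter = iw * ih
    union = ((a[2] - a[0] + 1.) * (a[3] - a[1] + 1.) +
             (b[2] - b[0] + 1.) * (b[3] - b[1] + 1.) - inter)
    iou = inter / union
    # squared distance between the two box centers
    p2 = (((a[0] + a[2]) / 2 - (b[0] + b[2]) / 2) ** 2 +
          ((a[1] + a[3]) / 2 - (b[1] + b[3]) / 2) ** 2)
    # squared diagonal of the smallest enclosing box
    cw = max(a[2], b[2]) - min(a[0], b[0]) + 1.
    ch = max(a[3], b[3]) - min(a[1], b[1]) + 1.
    return iou, iou - p2 / (cw ** 2 + ch ** 2)

gt = [50, 60, 110, 130]
centered = [65, 75, 95, 115]  # hypothetical 31 x 41 box sharing gt's center
corner = [50, 60, 80, 100]    # hypothetical 31 x 41 box in gt's top-left corner

iou_c, diou_c = iou_and_diou(centered, gt)
iou_k, diou_k = iou_and_diou(corner, gt)

print(iou_c == iou_k)    # True: same overlap
print(diou_c > diou_k)   # True: the centered box scores higher
```

For the centered box the distance term is zero, so its DIOU equals its IOU; the corner box is penalized for being off-center.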
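Likewise, building on the GIOU code, the code below computes the DIOUs between n (n >= 1) predicted boxes and one ground-truth box, and also returns the maximum DIOU and the index of the predicted box that attains it: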
def get_DIoUs_npre_1gt(pred_bboxes, gt_bbox):
    '''
    return (the DIOUs, the max DIOU, index of the max DIOU)
    between n pred bboxes (list)[x1, y1, x2, y2] and 1 gt bbox (list)[x1, y1, x2, y2]
    '''
    import numpy as np
    pred_bboxes, gt_bbox = np.array(pred_bboxes), np.array(gt_bbox)
    if pred_bboxes.shape[0] > 0:
        # -----0---- get [inter1[x1, y1, x2, y2], inter2, ...] and [wh1[w, h], wh2, ...] of inters
        ixmin = np.maximum(pred_bboxes[:, 0], gt_bbox[0])  # x1 of the inters
        iymin = np.maximum(pred_bboxes[:, 1], gt_bbox[1])  # y1 of the inters
        ixmax = np.minimum(pred_bboxes[:, 2], gt_bbox[2])  # x2 of the inters
        iymax = np.minimum(pred_bboxes[:, 3], gt_bbox[3])  # y2 of the inters
        iw = np.maximum(ixmax - ixmin + 1., 0.)  # w of the inters
        ih = np.maximum(iymax - iymin + 1., 0.)  # h of the inters
        # -----1----- intersection
        area_inter = iw * ih
        # -----2----- area_uni = S1 + S2 - area_inter
        area_uni = ((gt_bbox[2] - gt_bbox[0] + 1.) * (gt_bbox[3] - gt_bbox[1] + 1.) +
                    (pred_bboxes[:, 2] - pred_bboxes[:, 0] + 1.) * (pred_bboxes[:, 3] - pred_bboxes[:, 1] + 1.) -
                    area_inter)
        # -----3----- iou
        IOUs = area_inter / area_uni
        # -----4----- Euclidean distance p between the centers of the two boxes
        pre_centerx = (pred_bboxes[:, 0] + pred_bboxes[:, 2]) / 2
        pre_centery = (pred_bboxes[:, 1] + pred_bboxes[:, 3]) / 2
        gt_centerx = (gt_bbox[0] + gt_bbox[2]) / 2
        gt_centery = (gt_bbox[1] + gt_bbox[3]) / 2
        p = np.sqrt((pre_centerx - gt_centerx)**2 + (pre_centery - gt_centery)**2)
        # -----5----- diagonal length C_p of the smallest enclosing convex box Cs
        cx_min = np.minimum(pred_bboxes[:, 0], gt_bbox[0])
        cy_min = np.minimum(pred_bboxes[:, 1], gt_bbox[1])
        cx_max = np.maximum(pred_bboxes[:, 2], gt_bbox[2])
        cy_max = np.maximum(pred_bboxes[:, 3], gt_bbox[3])
        cw = cx_max - cx_min + 1.
        ch = cy_max - cy_min + 1.
        C_p = np.sqrt(cw**2 + ch**2)
        # -----6----- DIOU = IOU - p**2 / C_p**2
        DIOUs = IOUs - p**2 / C_p**2
        max_DIOU = np.max(DIOUs)
        index_max_DIOU = np.argmax(DIOUs)
        return list(DIOUs), max_DIOU, index_max_DIOU
Let's run an example:
pred_bboxes = [[0, 0, 45, 55],  # outside
               [45, 55, 110, 135],  # across
               [55, 65, 105, 125]]  # inside
gt_bbox = [50, 60, 110, 130]
print(get_DIoUs_npre_1gt(pred_bboxes, gt_bbox))
CIOU (Complete IOU)
CIOU adds one more penalty term on top of DIOU: it also takes the aspect ratio of the boxes into account, which speeds up convergence.
$$CIOU(A,B)=IOU(A,B)-\frac{p^2(A,B)}{C_p^2}-\alpha v$$
Here $\alpha = \frac{v}{1-IOU+v}$ and $v = \frac{4}{\pi^2}\left(\arctan\frac{w_{pre}}{h_{pre}} - \arctan\frac{w_{gt}}{h_{gt}}\right)^2$.
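To get a feel for the two extra quantities, the sketch below evaluates v and α directly from widths and heights. The helper names aspect_ratio_term and trade_off, and the sample sizes, are hypothetical and used only for this illustration.

```python
import math

def aspect_ratio_term(w_pre, h_pre, w_gt, h_gt):
    """v from the CIOU formula; it is 0 when both boxes have the same w / h."""
    return 4 / math.pi ** 2 * (math.atan(w_pre / h_pre) - math.atan(w_gt / h_gt)) ** 2

def trade_off(v, iou):
    """alpha = v / (1 - IOU + v); undefined (0 / 0) when IOU = 1 and v = 0."""
    return v / (1 - iou + v)

v_same = aspect_ratio_term(30, 60, 61, 122)  # both boxes have w / h = 0.5
v_diff = aspect_ratio_term(60, 30, 30, 60)   # w / h = 2 versus w / h = 0.5

print(v_same)                  # 0.0
print(0 < v_diff < 1)          # True
print(trade_off(v_diff, 0.5))  # weight given to v at IOU = 0.5
```

Since each arctan of a positive ratio lies between 0 and π/2, v stays below 1. The weight α grows with IOU, so the aspect-ratio penalty matters most once the boxes already overlap well. Because α is 0 / 0 for a perfect match, implementations commonly add a small epsilon to its denominator.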
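Building on the DIOU code, the code below computes the CIOUs between n (n >= 1) predicted boxes and one ground-truth box, and also returns the maximum CIOU and the index of the predicted box that attains it: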
def get_CIoUs_npre_1gt(pred_bboxes, gt_bbox):
    '''
    return (the CIOUs, the max CIOU, index of the max CIOU)
    between n pred bboxes (list)[x1, y1, x2, y2] and 1 gt bbox (list)[x1, y1, x2, y2]
    '''
    import numpy as np
    pred_bboxes, gt_bbox = np.array(pred_bboxes), np.array(gt_bbox)
    if pred_bboxes.shape[0] > 0:
        # -----0---- get [inter1[x1, y1, x2, y2], inter2, ...] and [wh1[w, h], wh2, ...] of inters
        ixmin = np.maximum(pred_bboxes[:, 0], gt_bbox[0])  # x1 of the inters
        iymin = np.maximum(pred_bboxes[:, 1], gt_bbox[1])  # y1 of the inters
        ixmax = np.minimum(pred_bboxes[:, 2], gt_bbox[2])  # x2 of the inters
        iymax = np.minimum(pred_bboxes[:, 3], gt_bbox[3])  # y2 of the inters
        iw = np.maximum(ixmax - ixmin + 1., 0.)  # w of the inters
        ih = np.maximum(iymax - iymin + 1., 0.)  # h of the inters
        # -----1----- intersection
        area_inter = iw * ih
        # -----2----- area_uni = S1 + S2 - area_inter
        area_uni = ((gt_bbox[2] - gt_bbox[0] + 1.) * (gt_bbox[3] - gt_bbox[1] + 1.) +
                    (pred_bboxes[:, 2] - pred_bboxes[:, 0] + 1.) * (pred_bboxes[:, 3] - pred_bboxes[:, 1] + 1.) -
                    area_inter)
        # -----3----- iou
        IOUs = area_inter / area_uni
        # -----4----- Euclidean distance p between the centers of the two boxes
        pre_centerx = (pred_bboxes[:, 0] + pred_bboxes[:, 2]) / 2
        pre_centery = (pred_bboxes[:, 1] + pred_bboxes[:, 3]) / 2
        gt_centerx = (gt_bbox[0] + gt_bbox[2]) / 2
        gt_centery = (gt_bbox[1] + gt_bbox[3]) / 2
        p = np.sqrt((pre_centerx - gt_centerx)**2 + (pre_centery - gt_centery)**2)
        # -----5----- diagonal length C_p of the smallest enclosing convex box Cs
        cx_min = np.minimum(pred_bboxes[:, 0], gt_bbox[0])
        cy_min = np.minimum(pred_bboxes[:, 1], gt_bbox[1])
        cx_max = np.maximum(pred_bboxes[:, 2], gt_bbox[2])
        cy_max = np.maximum(pred_bboxes[:, 3], gt_bbox[3])
        cw = cx_max - cx_min + 1.
        ch = cy_max - cy_min + 1.
        C_p = np.sqrt(cw**2 + ch**2)
        # -----6----- calculate v
        # w and h use the same + 1. convention as the areas above,
        # which also avoids dividing by zero for a box with y1 == y2
        pred_w = pred_bboxes[:, 2] - pred_bboxes[:, 0] + 1.
        pred_h = pred_bboxes[:, 3] - pred_bboxes[:, 1] + 1.
        gt_w = gt_bbox[2] - gt_bbox[0] + 1.
        gt_h = gt_bbox[3] - gt_bbox[1] + 1.
        v = 4 / np.pi**2 * (np.arctan(pred_w / pred_h) - np.arctan(gt_w / gt_h))**2
        # -----7----- calculate α
        alpha = v / (1 - IOUs + v)
        # -----8----- CIOU = IOU - p**2 / C_p**2 - αv
        CIOUs = IOUs - p**2 / C_p**2 - alpha * v
        max_CIOU = np.max(CIOUs)
        index_max_CIOU = np.argmax(CIOUs)
        return list(CIOUs), max_CIOU, index_max_CIOU
Let's run an example:
pred_bboxes = [[0, 0, 45, 55],  # outside
               [45, 55, 110, 135],  # across
               [55, 65, 105, 125]]  # inside
gt_bbox = [50, 60, 110, 130]
print(get_CIoUs_npre_1gt(pred_bboxes, gt_bbox))
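The four functions above share most of their steps, which is the duplication mentioned at the start. As a rough sketch of how they could be folded into one function, here is a NumPy version with a switch for the variant. This is not YOLOv5's bbox_iou: the name box_overlaps, the kind parameter, and the eps guard are assumptions made for this illustration.

```python
import numpy as np

def box_overlaps(pred_bboxes, gt_bbox, kind="iou", eps=1e-7):
    """Scores between n pred boxes and 1 gt box, all given as [x1, y1, x2, y2].

    kind is one of "iou", "giou", "diou", "ciou" (hypothetical switch).
    Widths and heights use the + 1. convention of the functions above.
    """
    pred = np.asarray(pred_bboxes, dtype=float).reshape(-1, 4)
    gt = np.asarray(gt_bbox, dtype=float)

    # widths and heights of the predicted boxes and of the gt box
    pw, ph = pred[:, 2] - pred[:, 0] + 1., pred[:, 3] - pred[:, 1] + 1.
    gw, gh = gt[2] - gt[0] + 1., gt[3] - gt[1] + 1.

    # intersection and union: shared by every variant
    iw = np.maximum(np.minimum(pred[:, 2], gt[2]) - np.maximum(pred[:, 0], gt[0]) + 1., 0.)
    ih = np.maximum(np.minimum(pred[:, 3], gt[3]) - np.maximum(pred[:, 1], gt[1]) + 1., 0.)
    area_inter = iw * ih
    area_uni = pw * ph + gw * gh - area_inter
    iou = area_inter / area_uni
    if kind == "iou":
        return iou

    # smallest enclosing box: shared by GIOU, DIOU and CIOU
    cw = np.maximum(pred[:, 2], gt[2]) - np.minimum(pred[:, 0], gt[0]) + 1.
    ch = np.maximum(pred[:, 3], gt[3]) - np.minimum(pred[:, 1], gt[1]) + 1.
    if kind == "giou":
        area_c = cw * ch
        return iou - (area_c - area_uni) / area_c

    # squared center distance divided by the squared diagonal of the enclosing box
    p2 = (((pred[:, 0] + pred[:, 2]) - (gt[0] + gt[2])) ** 2 +
          ((pred[:, 1] + pred[:, 3]) - (gt[1] + gt[3])) ** 2) / 4
    diou = iou - p2 / (cw ** 2 + ch ** 2)
    if kind == "diou":
        return diou
    if kind == "ciou":
        v = 4 / np.pi ** 2 * (np.arctan(pw / ph) - np.arctan(gw / gh)) ** 2
        alpha = v / (1 - iou + v + eps)  # eps is an added guard against 0 / 0
        return diou - alpha * v
    raise ValueError(f"unknown kind: {kind!r}")

pred_bboxes = [[0, 0, 45, 55], [45, 55, 110, 135], [55, 65, 105, 125]]
gt_bbox = [50, 60, 110, 130]
for kind in ("iou", "giou", "diou", "ciou"):
    print(kind, box_overlaps(pred_bboxes, gt_bbox, kind))
```

The maximum score and its index can then be taken with np.max and np.argmax on the returned array, as in the functions above.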
This article draws on the code and explanations in the following posts:
https://zhuanlan.zhihu.com/p/47189358
https://blog.csdn.net/Flag_ing/article/details/123325828