A Fairly Detailed Code Implementation of the Various IOU Metrics in Object Detection

Contents

  • Preface
  • 1. IOU
    • 1.1 IOU_1pre_1gt
    • 1.2 IOUs_npre_1gt
  • 2. GIOU
  • 3. DIOU
  • 4. CIOU
  • References


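Preface

This article gives an overview of the various IOU metrics. It is code-centric, and the code follows the logic of each formula step by step so that it is easy to follow. While reading, you may notice that the implementations share a large amount of repeated code; YOLOv5 ==> link <== provides def bbox_iou, a concise, unified implementation of these IOU variants.
The main IOU variants are: IOU, GIOU, DIOU, and CIOU.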


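1. IOU

IOU (Intersection over Union) is the most basic of these metrics and describes how much two boxes overlap. It is the area of the intersection of the two boxes divided by the area of their union: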
$$IOU\left(A, B\right)=\frac{A \cap B}{A \cup B}$$
Starting simple, we first compute the IOU between two boxes, then the IOUs between one GT box and multiple predicted boxes.

1.1 IOU_1pre_1gt

First, here is how to compute the IOU between two boxes (1 predicted box Pre and 1 ground-truth box GT):

def get_IoU_1pre_1gt(pred_bbox, gt_bbox):
    """
    return the iou score
            between 1 pred bbox (list)[x1, y1, x2, y2] and 1 gt bbox (list)[x1, y1, x2, y2]
    """
    import numpy as np
    pred_bbox, gt_bbox = np.array(pred_bbox), np.array(gt_bbox)
    # -----0---- get [x1, y1, x2, y2] and [w, h] of inters
    ixmin = max(pred_bbox[0], gt_bbox[0]) # x1 of the inter
    iymin = max(pred_bbox[1], gt_bbox[1]) # y1 of the inter
    ixmax = min(pred_bbox[2], gt_bbox[2]) # x2 of the inter
    iymax = min(pred_bbox[3], gt_bbox[3]) # y2 of the inter
    iw = np.maximum(ixmax - ixmin + 1., 0.)  # if pred_bbox lies outside gt_bbox, ixmax - ixmin + 1. is negative, so clamp it to 0
    ih = np.maximum(iymax - iymin + 1., 0.)
    # -----1----- area of intersection
    area_inter = iw * ih
    # -----2----- area_uni = S1 + S2 - area_inter
    area_uni = ((pred_bbox[2] - pred_bbox[0] + 1.) * (pred_bbox[3] - pred_bbox[1] + 1.) +
                (gt_bbox[2] - gt_bbox[0] + 1.) * (gt_bbox[3] - gt_bbox[1] + 1.) -
                area_inter)
    # -----3----- iou
    IOU = area_inter / area_uni
    return IOU

Run an example:

pred_bbox = [45, 55, 110, 135]
gt_bbox = [50, 60, 110, 130]
print(get_IoU_1pre_1gt(pred_bbox, gt_bbox))

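To see where the result comes from, the same arithmetic can be written out by hand. This is only a check of the example above, assuming the inclusive-pixel convention implied by the `+ 1.` terms in the function (a box spanning x = 50..110 counts as 61 pixels wide); the variable names are illustrative.

```python
# Hand computation of the example above (inclusive pixel coordinates).
pred_bbox = [45, 55, 110, 135]
gt_bbox = [50, 60, 110, 130]

# Intersection rectangle is [50, 60, 110, 130], which here is the GT box itself.
inter_w = min(pred_bbox[2], gt_bbox[2]) - max(pred_bbox[0], gt_bbox[0]) + 1  # 61
inter_h = min(pred_bbox[3], gt_bbox[3]) - max(pred_bbox[1], gt_bbox[1]) + 1  # 71
area_inter = inter_w * inter_h                                               # 4331

area_pred = (110 - 45 + 1) * (135 - 55 + 1)    # 66 * 81 = 5346
area_gt = (110 - 50 + 1) * (130 - 60 + 1)      # 61 * 71 = 4331
area_union = area_pred + area_gt - area_inter  # 5346

iou = area_inter / area_union
print(round(iou, 4))  # 0.8101
```

Because the GT box lies entirely inside the predicted box, the intersection equals the GT area and the union equals the predicted area, so the IOU reduces to 4331 / 5346.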

1.2 IOUs_npre_1gt

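The following code computes the IOUs between n (n>=1) predicted boxes and 1 ground-truth box, and returns the maximum IOU value together with the index of the predicted box that achieves it: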

def get_IoUs_npre_1gt(pred_bboxes, gt_bbox):
    '''
    return (the IOUs, the max IOU, index of the max IOU)
            between n pred bboxes (list)[x1, y1, x2, y2] and 1 gt bbox (list)[x1, y1, x2, y2]
    '''
    import numpy as np
    pred_bboxes, gt_bbox = np.array(pred_bboxes), np.array(gt_bbox)
    if pred_bboxes.shape[0] > 0:
        # -----0---- get [inter1[x1, y1, x2, y2], inter2, ...] and [wh1[w, h], wh2, ...] of inters
        ixmin = np.maximum(pred_bboxes[:, 0], gt_bbox[0]) # x1 of the inters
        iymin = np.maximum(pred_bboxes[:, 1], gt_bbox[1]) # y1 of the inters
        ixmax = np.minimum(pred_bboxes[:, 2], gt_bbox[2]) # x2 of the inters
        iymax = np.minimum(pred_bboxes[:, 3], gt_bbox[3]) # y2 of the inters
        iw = np.maximum(ixmax - ixmin + 1., 0.) # w of the inters
        ih = np.maximum(iymax - iymin + 1., 0.) # h of the inters
        # -----1----- intersection
        area_inter = iw * ih
        # -----2----- area_uni = S1 + S2 - area_inter
        area_uni = ((gt_bbox[2] - gt_bbox[0] + 1.) * (gt_bbox[3] - gt_bbox[1] + 1.) +
                    (pred_bboxes[:, 2] - pred_bboxes[:, 0] + 1.) * (pred_bboxes[:, 3] - pred_bboxes[:, 1] + 1.) -
                    area_inter)
        # -----3----- iou, get max score and max iou index
        IOUs = area_inter / area_uni
        
        max_IOU = np.max(IOUs)
        index_max_IOU = np.argmax(IOUs)
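    else:
        # No predicted boxes: return empty results instead of hitting unbound names.
        return [], None, None
    return list(IOUs), max_IOU, index_max_IOU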

Run an example:

pred_bboxes = [[0, 0, 45, 55],      # outside
               [45, 55, 110, 135],  # across
               [55, 65, 105, 125]]  # inside
gt_bbox = [50, 60, 110, 130]
print(get_IoUs_npre_1gt(pred_bboxes, gt_bbox))

This returns the three IOUs, the maximum IOU value, and the index of the maximum IOU.

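2. GIOU

GIOU (Generalized IOU) is an adjustment built on top of IOU, with the following properties:

  • GIoU(A,B) <= IoU(A,B);
  • GIOU reflects both the distance between two boxes and how much they overlap;
  • GIOU ranges over [-1,1]. It approaches -1 when the two boxes do not intersect and are infinitely far apart, and equals 1 when the two boxes coincide exactly;
  • GIoU considers not only the overlapping region of the two boxes but also the non-overlapping region, so it reflects their degree of overlap better.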

The computation is as follows, where C denotes the smallest rectangle that encloses both A and B, and C\(A∪B) denotes the area of C minus the area of A∪B:
$$GIOU(A,B)=IOU(A,B)-\frac{C \setminus (A \cup B)}{C}$$
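
The properties listed above can be checked numerically. The sketch below is a minimal illustration, not part of the original code: `giou_pair` is a hypothetical helper for a single pair of boxes, and it assumes the same inclusive-pixel `+ 1` convention as the IOU functions in section 1.

```python
import numpy as np

def giou_pair(a, b):
    """Return (IOU, GIOU) for two [x1, y1, x2, y2] boxes (hypothetical helper)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    # Intersection width/height, clamped to 0 for disjoint boxes.
    iw = max(min(a[2], b[2]) - max(a[0], b[0]) + 1.0, 0.0)
    ih = max(min(a[3], b[3]) - max(a[1], b[1]) + 1.0, 0.0)
    inter = iw * ih
    area_a = (a[2] - a[0] + 1.0) * (a[3] - a[1] + 1.0)
    area_b = (b[2] - b[0] + 1.0) * (b[3] - b[1] + 1.0)
    union = area_a + area_b - inter
    # Smallest enclosing rectangle C.
    cw = max(a[2], b[2]) - min(a[0], b[0]) + 1.0
    ch = max(a[3], b[3]) - min(a[1], b[1]) + 1.0
    area_c = cw * ch
    iou = inter / union
    giou = iou - (area_c - union) / area_c
    return float(iou), float(giou)

gt = [50, 60, 110, 130]
cases = {
    "identical": gt,
    "overlapping": [45, 55, 110, 135],
    "disjoint, near": [0, 0, 45, 55],
    "disjoint, far": [5000, 5000, 5045, 5055],
}
for name, box in cases.items():
    iou, giou = giou_pair(box, gt)
    print(f"{name:15s} IOU={iou:.4f} GIOU={giou:.4f}")
```

Both disjoint boxes have an IOU of 0, but their GIOU values differ: the farther box scores closer to -1. This is what lets GIOU still rank predictions that do not overlap the GT at all.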

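Building on the IOU code, the following computes the GIOUs between n (n>=1) predicted boxes and 1 ground-truth box, and returns the maximum GIOU value together with the index of the predicted box that achieves it: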

def get_GIoUs_npre_1gt(pred_bboxes, gt_bbox):
    '''
    return (the GIOUs, the max GIOU, index of the max GIOU)
            between n pred bboxes (list)[x1, y1, x2, y2] and 1 gt bbox (list)[x1, y1, x2, y2]
    '''
    import numpy as np
    pred_bboxes, gt_bbox = np.array(pred_bboxes), np.array(gt_bbox)
    if pred_bboxes.shape[0] > 0:
        # -----0---- get [inter1[x1, y1, x2, y2], inter2, ...] and [wh1[w, h], wh2, ...] of inters
        ixmin = np.maximum(pred_bboxes[:, 0], gt_bbox[0]) # x1 of the inters
        iymin = np.maximum(pred_bboxes[:, 1], gt_bbox[1]) # y1 of the inters
        ixmax = np.minimum(pred_bboxes[:, 2], gt_bbox[2]) # x2 of the inters
        iymax = np.minimum(pred_bboxes[:, 3], gt_bbox[3]) # y2 of the inters
        iw = np.maximum(ixmax - ixmin + 1., 0.) # w of the inters
        ih = np.maximum(iymax - iymin + 1., 0.) # h of the inters
        # -----1----- intersection
        area_inter = iw * ih
        # -----2----- area_uni = S1 + S2 - area_inter
        area_uni = ((gt_bbox[2] - gt_bbox[0] + 1.) * (gt_bbox[3] - gt_bbox[1] + 1.) +
                    (pred_bboxes[:, 2] - pred_bboxes[:, 0] + 1.) * (pred_bboxes[:, 3] - pred_bboxes[:, 1] + 1.) -
                    area_inter)
        # -----3----- iou
        IOUs = area_inter / area_uni
        # -----4----- get the smallest enclosing convex box Cs -> [C1[x1,y1,x2,y2], C2, ...]
        cx_min = np.minimum(pred_bboxes[:, 0], gt_bbox[0])
        cy_min = np.minimum(pred_bboxes[:, 1], gt_bbox[1])
        cx_max = np.maximum(pred_bboxes[:, 2], gt_bbox[2])
        cy_max = np.maximum(pred_bboxes[:, 3], gt_bbox[3])
        cw = cx_max - cx_min + 1.
        ch = cy_max - cy_min + 1.
        # -----5----- area of C
        area_c = cw * ch
        # -----6----- GIOU = IOU - (area of C - area of A∪B) / area of C
        GIOUs = IOUs - (area_c - area_uni) / area_c
        
        max_GIOU = np.max(GIOUs)
        index_max_GIOU = np.argmax(GIOUs)
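    else:
        # No predicted boxes: return empty results instead of hitting unbound names.
        return [], None, None
    return list(GIOUs), max_GIOU, index_max_GIOU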

Run an example:

pred_bboxes = [[0, 0, 45, 55],      # outside
               [45, 55, 110, 135],  # across
               [55, 65, 105, 125]]  # inside
gt_bbox = [50, 60, 110, 130]
print(get_GIoUs_npre_1gt(pred_bboxes, gt_bbox))


3. DIOU

DIOU (Distance IOU) ranges over [-1,1] and takes into account the distance between the two boxes, their overlap, and their scale. Its formula is:
$$DIOU(A,B)=IOU(A,B)-\frac{p^2(A,B)}{C_p^2}$$
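In the formula above, $p(A,B)$ is the Euclidean distance between the center of box A and the center of box B (so $p^2(A,B)$ is its square), and $C_p$ is the diagonal length of the smallest rectangle that encloses both A and B.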
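
A small experiment shows what the distance term adds. The sketch below is illustrative only: `iou_giou_diou` is a hypothetical helper that follows the formulas above with the same `+ 1` area convention, and the boxes are made up so that two predictions of equal size sit inside the GT, one centered and one pushed into a corner.

```python
import numpy as np

def iou_giou_diou(pred_bboxes, gt_bbox):
    """Return (IOU, GIOU, DIOU) arrays for n pred boxes vs 1 gt box (hypothetical helper)."""
    pred = np.asarray(pred_bboxes, dtype=float).reshape(-1, 4)
    gt = np.asarray(gt_bbox, dtype=float)
    # Intersection and union.
    iw = np.maximum(np.minimum(pred[:, 2], gt[2]) - np.maximum(pred[:, 0], gt[0]) + 1.0, 0.0)
    ih = np.maximum(np.minimum(pred[:, 3], gt[3]) - np.maximum(pred[:, 1], gt[1]) + 1.0, 0.0)
    inter = iw * ih
    area_pred = (pred[:, 2] - pred[:, 0] + 1.0) * (pred[:, 3] - pred[:, 1] + 1.0)
    area_gt = (gt[2] - gt[0] + 1.0) * (gt[3] - gt[1] + 1.0)
    union = area_pred + area_gt - inter
    iou = inter / union
    # Smallest enclosing box C: its area (GIOU) and squared diagonal (DIOU).
    cw = np.maximum(pred[:, 2], gt[2]) - np.minimum(pred[:, 0], gt[0]) + 1.0
    ch = np.maximum(pred[:, 3], gt[3]) - np.minimum(pred[:, 1], gt[1]) + 1.0
    giou = iou - (cw * ch - union) / (cw * ch)
    # Squared distance between box centers.
    p2 = (((pred[:, 0] + pred[:, 2]) / 2 - (gt[0] + gt[2]) / 2) ** 2
          + ((pred[:, 1] + pred[:, 3]) / 2 - (gt[1] + gt[3]) / 2) ** 2)
    diou = iou - p2 / (cw ** 2 + ch ** 2)
    return iou, giou, diou

gt_bbox = [0, 0, 99, 99]               # 100 x 100 box
pred_bboxes = [[25, 25, 74, 74],       # 50 x 50, centered in the GT
               [0, 0, 49, 49]]         # 50 x 50, in the top-left corner
iou, giou, diou = iou_giou_diou(pred_bboxes, gt_bbox)
print(iou.tolist())   # [0.25, 0.25]
print(giou.tolist())  # [0.25, 0.25]
print(diou.tolist())  # [0.25, 0.1875]
```

IOU and GIOU cannot tell the two predictions apart because both lie fully inside the GT, while DIOU penalizes the off-center box through the center-distance term.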

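Similarly, building on the GIOU code, the following computes the DIOUs between n (n>=1) predicted boxes and 1 ground-truth box, and returns the maximum DIOU value together with the index of the predicted box that achieves it: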

def get_DIoUs_npre_1gt(pred_bboxes, gt_bbox):
    '''
    return (the DIOUs, the max DIOU, index of the max DIOU)
            between n pred bboxes (list)[x1, y1, x2, y2] and 1 gt bbox (list)[x1, y1, x2, y2]
    '''
    import numpy as np
    pred_bboxes, gt_bbox = np.array(pred_bboxes), np.array(gt_bbox)
    if pred_bboxes.shape[0] > 0:
        # -----0---- get [inter1[x1, y1, x2, y2], inter2, ...] and [wh1[w, h], wh2, ...] of inters
        ixmin = np.maximum(pred_bboxes[:, 0], gt_bbox[0]) # x1 of the inters
        iymin = np.maximum(pred_bboxes[:, 1], gt_bbox[1]) # y1 of the inters
        ixmax = np.minimum(pred_bboxes[:, 2], gt_bbox[2]) # x2 of the inters
        iymax = np.minimum(pred_bboxes[:, 3], gt_bbox[3]) # y2 of the inters
        iw = np.maximum(ixmax - ixmin + 1., 0.) # w of the inters
        ih = np.maximum(iymax - iymin + 1., 0.) # h of the inters
        # -----1----- intersection
        area_inter = iw * ih
        # -----2----- area_uni = S1 + S2 - area_inter
        area_uni = ((gt_bbox[2] - gt_bbox[0] + 1.) * (gt_bbox[3] - gt_bbox[1] + 1.) +
                    (pred_bboxes[:, 2] - pred_bboxes[:, 0] + 1.) * (pred_bboxes[:, 3] - pred_bboxes[:, 1] + 1.) -
                    area_inter)
        # -----3----- iou
        IOUs = area_inter / area_uni
        # -----4----- Euclidean distance between the two box centers
        pre_centerx = (pred_bboxes[:, 0] + pred_bboxes[:, 2]) / 2
        pre_centery = (pred_bboxes[:, 1] + pred_bboxes[:, 3]) / 2
        gt_centerx = (gt_bbox[0] + gt_bbox[2]) / 2
        gt_centery = (gt_bbox[1] + gt_bbox[3]) / 2
        p = np.sqrt((pre_centerx - gt_centerx)**2 + (pre_centery - gt_centery)**2)
        # -----5----- diagonal length C_p of the smallest enclosing box C
        cx_min = np.minimum(pred_bboxes[:, 0], gt_bbox[0])
        cy_min = np.minimum(pred_bboxes[:, 1], gt_bbox[1])
        cx_max = np.maximum(pred_bboxes[:, 2], gt_bbox[2])
        cy_max = np.maximum(pred_bboxes[:, 3], gt_bbox[3])
        cw = cx_max - cx_min + 1.
        ch = cy_max - cy_min + 1.
        C_p = np.sqrt(cw**2 + ch**2)
        # -----6----- DIOU = IOU - p**2 / C_p**2
        DIOUs = IOUs - p**2 / C_p**2
        
        max_DIOU = np.max(DIOUs)
        index_max_DIOU = np.argmax(DIOUs)
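    else:
        # No predicted boxes: return empty results instead of hitting unbound names.
        return [], None, None
    return list(DIOUs), max_DIOU, index_max_DIOU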

Run an example:

pred_bboxes = [[0, 0, 45, 55],      # outside
               [45, 55, 110, 135],  # across
               [55, 65, 105, 125]]  # inside
gt_bbox = [50, 60, 110, 130]
print(get_DIoUs_npre_1gt(pred_bboxes, gt_bbox))


4. CIOU

CIOU (Complete IOU) adds one more penalty term on top of DIOU: it additionally accounts for the aspect ratio, which speeds up convergence.
$$CIOU(A,B)=IOU(A,B)-\frac{p^2(A,B)}{C_p^2}-\alpha v$$
In the formula above,
$$\alpha = \frac{v}{1-IOU+v}, \qquad v = \frac{4}{\pi^2}\left(\arctan\frac{w_{pre}}{h_{pre}} - \arctan\frac{w_{gt}}{h_{gt}} \right)^2$$
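
To get a feel for the two extra terms, they can be evaluated on their own. The sketch below is a minimal illustration with made-up widths, heights, and IOU values; `aspect_penalty` is a hypothetical helper, and it assumes positive widths and heights and a non-zero denominator (that is, not both IOU = 1 and v = 0).

```python
import numpy as np

def aspect_penalty(w_pre, h_pre, w_gt, h_gt, iou):
    """Return (v, alpha) from the CIOU definition (hypothetical helper)."""
    # v measures how much the two aspect ratios differ; it is 0 when they match.
    v = 4 / np.pi ** 2 * (np.arctan(w_pre / h_pre) - np.arctan(w_gt / h_gt)) ** 2
    # alpha weights v; for a fixed v it grows as the IOU grows.
    alpha = v / (1 - iou + v)
    return float(v), float(alpha)

# Same aspect ratio (3:4) at different scales: no penalty.
v_same, alpha_same = aspect_penalty(30, 40, 60, 80, iou=0.5)
# Wide box against a tall GT: non-zero penalty.
v_diff, alpha_diff = aspect_penalty(80, 20, 60, 80, iou=0.5)
# Same shapes but a higher IOU: alpha increases.
_, alpha_high_iou = aspect_penalty(80, 20, 60, 80, iou=0.9)

print(v_same, alpha_same)
print(v_diff, alpha_diff, alpha_high_iou)
```

Since the arctangent of a positive ratio lies strictly between 0 and π/2, v stays in [0, 1). The weight α makes the aspect-ratio term matter more once the boxes already overlap well.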

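Building on the DIOU code, the following computes the CIOUs between n (n>=1) predicted boxes and 1 ground-truth box, and returns the maximum CIOU value together with the index of the predicted box that achieves it: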

def get_CIoUs_npre_1gt(pred_bboxes, gt_bbox):
    '''
    return (the CIOUs, the max CIOU, index of the max CIOU)
            between n pred bboxes (list)[x1, y1, x2, y2] and 1 gt bbox (list)[x1, y1, x2, y2]
    '''
    import numpy as np
    pred_bboxes, gt_bbox = np.array(pred_bboxes), np.array(gt_bbox)
    if pred_bboxes.shape[0] > 0:
        # -----0---- get [inter1[x1, y1, x2, y2], inter2, ...] and [wh1[w, h], wh2, ...] of inters
        ixmin = np.maximum(pred_bboxes[:, 0], gt_bbox[0]) # x1 of the inters
        iymin = np.maximum(pred_bboxes[:, 1], gt_bbox[1]) # y1 of the inters
        ixmax = np.minimum(pred_bboxes[:, 2], gt_bbox[2]) # x2 of the inters
        iymax = np.minimum(pred_bboxes[:, 3], gt_bbox[3]) # y2 of the inters
        iw = np.maximum(ixmax - ixmin + 1., 0.) # w of the inters
        ih = np.maximum(iymax - iymin + 1., 0.) # h of the inters
        # -----1----- intersection
        area_inter = iw * ih
        # -----2----- area_uni = S1 + S2 - area_inter
        area_uni = ((gt_bbox[2] - gt_bbox[0] + 1.) * (gt_bbox[3] - gt_bbox[1] + 1.) +
                    (pred_bboxes[:, 2] - pred_bboxes[:, 0] + 1.) * (pred_bboxes[:, 3] - pred_bboxes[:, 1] + 1.) -
                    area_inter)
        # -----3----- iou
        IOUs = area_inter / area_uni
        # -----4----- Euclidean distance between the two box centers
        pre_centerx = (pred_bboxes[:, 0] + pred_bboxes[:, 2]) / 2
        pre_centery = (pred_bboxes[:, 1] + pred_bboxes[:, 3]) / 2
        gt_centerx = (gt_bbox[0] + gt_bbox[2]) / 2
        gt_centery = (gt_bbox[1] + gt_bbox[3]) / 2
        p = np.sqrt((pre_centerx - gt_centerx)**2 + (pre_centery - gt_centery)**2)
        # -----5----- diagonal length C_p of the smallest enclosing box C
        cx_min = np.minimum(pred_bboxes[:, 0], gt_bbox[0])
        cy_min = np.minimum(pred_bboxes[:, 1], gt_bbox[1])
        cx_max = np.maximum(pred_bboxes[:, 2], gt_bbox[2])
        cy_max = np.maximum(pred_bboxes[:, 3], gt_bbox[3])
        cw = cx_max - cx_min + 1.
        ch = cy_max - cy_min + 1.
        C_p = np.sqrt(cw**2 + ch**2)
        # -----6----- calculate v
        import math
        v = 4 / math.pi**2 * (np.arctan((pred_bboxes[:, 2] - pred_bboxes[:, 0]) / (pred_bboxes[:, 3] - pred_bboxes[:, 1])) -
                           np.arctan((gt_bbox[2] - gt_bbox[0]) / (gt_bbox[3] - gt_bbox[1])))**2
        # -----7----- calculate α; the small constant avoids 0/0 when IOU == 1 and v == 0
        alpha = v / (1 - IOUs + v + 1e-7)
        # -----8----- CIOU = IOU - p**2 / C_p**2 - αv
        CIOUs = IOUs - p**2 / C_p**2 - alpha * v
        
        max_CIOU = np.max(CIOUs)
        index_max_CIOU = np.argmax(CIOUs)
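    else:
        # No predicted boxes: return empty results instead of hitting unbound names.
        return [], None, None
    return list(CIOUs), max_CIOU, index_max_CIOU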

Run an example:

pred_bboxes = [[0, 0, 45, 55],      # outside
               [45, 55, 110, 135],  # across
               [55, 65, 105, 125]]  # inside
gt_bbox = [50, 60, 110, 130]
print(get_CIoUs_npre_1gt(pred_bboxes, gt_bbox))

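The four n-to-1 functions above repeat most of their code, as the preface notes. One possible way to fold them into a single function is sketched below. This is not YOLOv5's `bbox_iou`; the name `bbox_metric_npre_1gt`, the `kind` parameter, and the small constant in α are assumptions made for illustration. It keeps the conventions used in this article (inclusive-pixel `+ 1` for areas and the enclosing box, plain `x2 - x1` and `y2 - y1` inside v).

```python
import numpy as np

def bbox_metric_npre_1gt(pred_bboxes, gt_bbox, kind="IOU"):
    """Hypothetical helper: IOU / GIOU / DIOU / CIOU of n pred boxes vs 1 gt box.

    Boxes are [x1, y1, x2, y2]. Returns a 1-D array with one score per pred box.
    """
    pred = np.asarray(pred_bboxes, dtype=float).reshape(-1, 4)
    gt = np.asarray(gt_bbox, dtype=float)

    # Shared part: intersection, union, and plain IOU.
    iw = np.maximum(np.minimum(pred[:, 2], gt[2]) - np.maximum(pred[:, 0], gt[0]) + 1.0, 0.0)
    ih = np.maximum(np.minimum(pred[:, 3], gt[3]) - np.maximum(pred[:, 1], gt[1]) + 1.0, 0.0)
    area_inter = iw * ih
    area_pred = (pred[:, 2] - pred[:, 0] + 1.0) * (pred[:, 3] - pred[:, 1] + 1.0)
    area_gt = (gt[2] - gt[0] + 1.0) * (gt[3] - gt[1] + 1.0)
    area_uni = area_pred + area_gt - area_inter
    iou = area_inter / area_uni
    if kind == "IOU":
        return iou

    # Smallest enclosing box C, needed by GIOU, DIOU, and CIOU.
    cw = np.maximum(pred[:, 2], gt[2]) - np.minimum(pred[:, 0], gt[0]) + 1.0
    ch = np.maximum(pred[:, 3], gt[3]) - np.minimum(pred[:, 1], gt[1]) + 1.0
    if kind == "GIOU":
        area_c = cw * ch
        return iou - (area_c - area_uni) / area_c

    # Squared center distance over the squared diagonal of C.
    p2 = (((pred[:, 0] + pred[:, 2]) / 2 - (gt[0] + gt[2]) / 2) ** 2
          + ((pred[:, 1] + pred[:, 3]) / 2 - (gt[1] + gt[3]) / 2) ** 2)
    diou = iou - p2 / (cw ** 2 + ch ** 2)
    if kind == "DIOU":
        return diou

    if kind == "CIOU":
        w_pre, h_pre = pred[:, 2] - pred[:, 0], pred[:, 3] - pred[:, 1]
        w_gt, h_gt = gt[2] - gt[0], gt[3] - gt[1]
        v = 4 / np.pi ** 2 * (np.arctan(w_pre / h_pre) - np.arctan(w_gt / h_gt)) ** 2
        alpha = v / (1 - iou + v + 1e-7)  # 1e-7 is an assumed guard against 0/0
        return diou - alpha * v

    raise ValueError(f"unknown kind: {kind!r}")

pred_bboxes = [[0, 0, 45, 55], [45, 55, 110, 135], [55, 65, 105, 125]]
gt_bbox = [50, 60, 110, 130]
for kind in ("IOU", "GIOU", "DIOU", "CIOU"):
    scores = bbox_metric_npre_1gt(pred_bboxes, gt_bbox, kind)
    print(kind, np.round(scores, 4), "best index:", int(np.argmax(scores)))
```

Each metric reuses everything computed for the previous one, which mirrors how the formulas build on each other. Returning only the score array and leaving `np.max` and `np.argmax` to the caller is a design choice of this sketch; it also makes an empty input return an empty array rather than fail.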


References

This post draws on code and explanations from several other authors.
https://zhuanlan.zhihu.com/p/47189358
https://blog.csdn.net/Flag_ing/article/details/123325828
