The loss function for object detection consists of two parts: a classification loss and a bounding box regression loss.
This article traces how the bounding box regression loss function has evolved in object detection over the past few years. The progression is:
Smooth L1 Loss → IoU Loss → GIoU Loss → DIoU Loss → CIoU Loss
The article follows the same order.
Smooth L1 Loss was proposed by Ross Girshick (rbg) of Microsoft in the Fast R-CNN paper.
1) Let $x$ denote the model's prediction and $y$ the ground truth, and let $z = x - y$ be the difference between them.
The commonly used L1 loss, L2 loss, and smooth L1 loss are defined as:
$$L_{L1}(x,y) = |x - y| = |z|$$

$$L_{L2}(x,y) = 0.5(x - y)^2 = 0.5z^2$$

$$L_{smoothL1}(x,y)=\begin{cases} 0.5(x-y)^2 = 0.5z^2, & \text{if } |x-y|<1 \\ |x-y|-0.5 = |z|-0.5, & \text{otherwise} \end{cases}$$
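The three definitions above translate directly into code. The following is a minimal plain-Python sketch for scalar inputs; the function names `l1_loss`, `l2_loss`, and `smooth_l1_loss` are chosen here for illustration and are not taken from any library.

```python
def l1_loss(x: float, y: float) -> float:
    """L1 loss: absolute difference between prediction x and target y."""
    return abs(x - y)


def l2_loss(x: float, y: float) -> float:
    """L2 loss with the 0.5 factor used in the definition above."""
    z = x - y
    return 0.5 * z * z


def smooth_l1_loss(x: float, y: float) -> float:
    """Smooth L1 loss: quadratic for |z| < 1, linear otherwise."""
    z = abs(x - y)
    return 0.5 * z * z if z < 1.0 else z - 0.5


# Compare the three losses for a small and a large error (target y = 0).
for z in (0.5, 3.0):
    print(z, l1_loss(z, 0.0), l2_loss(z, 0.0), smooth_l1_loss(z, 0.0))
```

For the error 0.5 smooth L1 matches the L2 value (0.125), and for the error 3.0 it is the L1 value shifted by 0.5 (2.5), so the two branches meet continuously at |z| = 1.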
The derivatives of these three loss functions with respect to $z$ are:
$$\frac{\partial L_{L1}(x,y)}{\partial z} = \begin{cases} 1, & \text{if } z \geq 0 \\ -1, & \text{otherwise} \end{cases}$$

$$\frac{\partial L_{L2}(x,y)}{\partial z} = z$$

$$\frac{\partial L_{smoothL1}(x,y)}{\partial z} = \begin{cases} z, & \text{if } |x-y|<1 \\ \pm 1, & \text{otherwise} \end{cases}$$
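As a quick sanity check on the derivative formulas, the sketch below compares the analytic smooth L1 derivative with a central finite difference. The helper names (`d_l1`, `d_l2`, `d_smooth_l1`, `numeric_derivative`) and the step size `h` are assumptions made for this illustration.

```python
def smooth_l1(z: float) -> float:
    """Smooth L1 loss as a function of the error z = x - y."""
    a = abs(z)
    return 0.5 * a * a if a < 1.0 else a - 0.5


def d_l1(z: float) -> float:
    """Derivative of |z|, using the z >= 0 convention from the text."""
    return 1.0 if z >= 0 else -1.0


def d_l2(z: float) -> float:
    """Derivative of 0.5 * z^2."""
    return z


def d_smooth_l1(z: float) -> float:
    """Derivative of smooth L1: z inside (-1, 1), otherwise the sign of z."""
    if abs(z) < 1.0:
        return z
    return 1.0 if z > 0 else -1.0


def numeric_derivative(f, z: float, h: float = 1e-6) -> float:
    """Central finite difference, used only to check the formulas."""
    return (f(z + h) - f(z - h)) / (2.0 * h)


for z in (-3.0, -0.4, 0.2, 2.0):
    print(z, d_l1(z), d_l2(z), d_smooth_l1(z), numeric_derivative(smooth_l1, z))
```

The last two printed columns should agree up to floating-point error, while the L2 column keeps growing with |z| and the L1 column stays at magnitude 1.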
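From these derivatives (since $z = x - y$, the derivative with respect to $x$ equals the derivative with respect to $z$) we can see that the L1 derivative keeps a constant magnitude of 1 no matter how small the error is, so the update does not shrink as the prediction approaches the target. The L2 derivative equals $z$, so it grows without bound as the error grows and large errors dominate the gradient. The smooth L1 derivative equals $z$ when $|z| < 1$ and is capped at $\pm 1$ otherwise, so it is small for small errors and bounded for large ones.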
2) In practice, the loss used for bounding box regression in object detection is:
$$L_{loc}(t, v) = \sum_{i \in \{x, y, w, h\}} smooth_{L1}(t_i - v_i)$$
That is, the losses for x, y, w, and h are computed separately and then summed to give the bounding box regression loss.
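The per-coordinate sum can be sketched as follows. This is an illustrative plain-Python version; `loc_loss` is a hypothetical name, and `t` and `v` are assumed to be 4-tuples of predicted and target regression values in the order (x, y, w, h).

```python
def smooth_l1(z: float) -> float:
    """Smooth L1 loss of a single error term z."""
    a = abs(z)
    return 0.5 * a * a if a < 1.0 else a - 0.5


def loc_loss(t, v) -> float:
    """Sum of smooth L1 losses over the four box terms (x, y, w, h)."""
    return sum(smooth_l1(ti - vi) for ti, vi in zip(t, v))


# Hypothetical values: errors of 0.5, 0.2, 2.0 and 0.0 on x, y, w, h.
t = (0.5, 0.2, 2.0, 1.0)
v = (0.0, 0.0, 0.0, 1.0)
print(loc_loss(t, v))  # 0.125 + 0.02 + 1.5 + 0.0
```

Each of the four terms contributes on its own, and nothing in the sum reflects how the four values jointly describe one box.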
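3) Drawbacks
When any of the three losses above is used as the bounding box loss, the losses for x, y, w, and h are computed independently and then summed to give the final bounding box loss. This implicitly assumes that x, y, w, and h are independent of each other, whereas in practice they are correlated.
This motivates the following solution: computing the loss from IoU.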
IoU Loss was proposed by Megvii and published in the 2016 ACM paper "UnitBox: An Advanced Object Detection Network".
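2) Based on this, IoU Loss was proposed. It treats the box formed by the four coordinates as a single unit and regresses it as a whole:
$$L_{IoU} = -\ln(IoU)$$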
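To illustrate what regressing the box as a whole means, here is a small plain-Python sketch of IoU and of the $-\ln(IoU)$ form listed in this article's summary. The box layout `(x1, y1, x2, y2)` follows the reference code at the end of the article; the `eps` term is an assumption added here only to keep the logarithm finite when the boxes do not overlap.

```python
import math


def iou(a, b) -> float:
    """IoU of two boxes given as (x1, y1, x2, y2) with positive area."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def iou_loss(pred, target, eps: float = 1e-9) -> float:
    """-ln(IoU); eps is an illustrative guard against log(0)."""
    return -math.log(iou(pred, target) + eps)


pred = (0.0, 0.0, 2.0, 2.0)
target = (1.0, 1.0, 3.0, 3.0)
print(iou(pred, target))       # intersection 1, union 7
print(iou_loss(pred, target))  # close to ln(7)
```

Unlike the per-coordinate sum, a change in any one coordinate here alters the intersection and union together, so the four values are coupled through a single quantity.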
GIoU Loss was proposed by researchers at Stanford and published at CVPR 2019.
1) IoU Loss has two drawbacks:
Finally, the loss can be computed with the following formula:
$$L_{GIoU} = 1 - GIoU$$
3) Computing the area of C
Suppose we have two arbitrary bounding boxes A and B. We want to find the smallest enclosing shape C that contains both A and B.
$$A = (x_{min}^A, x_{max}^A, y_{min}^A, y_{max}^A)$$
$$B = (x_{min}^B, x_{max}^B, y_{min}^B, y_{max}^B)$$
Then the coordinates of C are:
$$x_{min}^C = \min(x_{min}^A, x_{min}^B), \quad x_{max}^C = \max(x_{max}^A, x_{max}^B)$$
$$y_{min}^C = \min(y_{min}^A, y_{min}^B), \quad y_{max}^C = \max(y_{max}^A, y_{max}^B)$$
The area of C is:
$$Area^C = (x_{max}^C - x_{min}^C) \times (y_{max}^C - y_{min}^C)$$
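The construction of C can be sketched in a few lines of plain Python. Boxes use the `(x_min, x_max, y_min, y_max)` order from the text. The `giou` function uses the form that appears in this article's reference code (IoU minus the fraction of C not covered by A ∪ B); the function names are illustrative, and boxes are assumed to have positive area.

```python
def area(box) -> float:
    """Area of a box given as (x_min, x_max, y_min, y_max)."""
    return (box[1] - box[0]) * (box[3] - box[2])


def enclosing_box(a, b):
    """Smallest axis-aligned box C that contains both a and b."""
    return (min(a[0], b[0]), max(a[1], b[1]),
            min(a[2], b[2]), max(a[3], b[3]))


def giou(a, b) -> float:
    """GIoU = IoU - (|C| - |A union B|) / |C|."""
    inter_w = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[2], b[2]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    c_area = area(enclosing_box(a, b))
    return inter / union - (c_area - union) / c_area


# Two non-overlapping 2x2 boxes separated by a gap of width 1.
A = (0.0, 2.0, 0.0, 2.0)
B = (3.0, 5.0, 0.0, 2.0)
print(enclosing_box(A, B))  # (0.0, 5.0, 0.0, 2.0), area 10
print(giou(A, B))           # 0 - (10 - 8) / 10
print(1.0 - giou(A, B))     # the GIoU loss
```

Here IoU is 0, yet GIoU is negative and moves toward 0 as the gap closes, so the loss still carries a signal for non-overlapping boxes.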
4) Compared with IoU, GIoU has the following properties:
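GIoU Loss still has a shortcoming: when the target box completely encloses the predicted box, the smallest enclosing box C coincides with the target box, so the penalty term vanishes and GIoU takes the same value as IoU. In this case GIoU degenerates to IoU and cannot distinguish the relative position of the two boxes.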
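A small numeric example makes this degeneracy concrete. The sketch below is plain Python with boxes as `(x_min, x_max, y_min, y_max)`; the helper name `iou_and_giou` and the box values are made up for illustration.

```python
def area(box) -> float:
    """Area of a box given as (x_min, x_max, y_min, y_max)."""
    return (box[1] - box[0]) * (box[3] - box[2])


def iou_and_giou(a, b):
    """Return (IoU, GIoU) for two boxes with positive area."""
    inter_w = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[2], b[2]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    c = (min(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), max(a[3], b[3]))
    iou = inter / union
    return iou, iou - (area(c) - union) / area(c)


target = (0.0, 4.0, 0.0, 4.0)
pred_center = (1.5, 2.5, 1.5, 2.5)  # 1x1 box in the middle of the target
pred_corner = (0.0, 1.0, 0.0, 1.0)  # 1x1 box in a corner of the target

print(iou_and_giou(target, pred_center))
print(iou_and_giou(target, pred_corner))
```

Both calls print `(0.0625, 0.0625)`: the centered and the corner prediction get identical IoU and GIoU, even though the centered one is clearly better aligned with the target.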
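Given these problems with IoU and GIoU, two directions are worth considering:
A good bounding box regression loss should take three important geometric factors into account: overlap area, center-point distance, and aspect ratio.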
DIoU Loss and CIoU Loss were published at AAAI 2020.
1) The CIoU penalty term adds an influence factor $\alpha v$ to the DIoU penalty term. This factor accounts for how well the aspect ratio of the predicted box fits that of the target box.
$$R_{CIoU} = \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$$
Here $\rho(b, b^{gt})$ is the distance between the center points of the predicted box and the target box, $c$ is the diagonal length of the smallest box enclosing both, and:
$\alpha$ is a trade-off parameter, $\alpha = \frac{v}{(1 - IoU) + v}$
$v$ measures the consistency of the aspect ratio, $v = \frac{4}{\pi^2}\left(\arctan\frac{w^{gt}}{h^{gt}} - \arctan\frac{w}{h}\right)^2$
The CIoU loss function is defined as: $L_{CIoU} = 1 - IoU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$
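The pieces of the CIoU loss can be assembled step by step as in the following plain-Python sketch. Boxes are `(x1, y1, x2, y2)` with positive width and height; the function name `ciou_loss` and the small `eps` in the denominator of $\alpha$ (which avoids 0/0 for identical boxes) are assumptions of this sketch, not part of the formula in the text.

```python
import math


def ciou_loss(pred, gt, eps: float = 1e-9) -> float:
    """1 - IoU + rho^2 / c^2 + alpha * v for boxes given as (x1, y1, x2, y2)."""
    w, h = pred[2] - pred[0], pred[3] - pred[1]
    w_gt, h_gt = gt[2] - gt[0], gt[3] - gt[1]

    # Overlap term: IoU.
    inter_w = max(0.0, min(pred[2], gt[2]) - max(pred[0], gt[0]))
    inter_h = max(0.0, min(pred[3], gt[3]) - max(pred[1], gt[1]))
    inter = inter_w * inter_h
    iou = inter / (w * h + w_gt * h_gt - inter)

    # Distance term: squared center distance over squared enclosing diagonal.
    rho2 = ((pred[0] + pred[2]) / 2 - (gt[0] + gt[2]) / 2) ** 2 \
         + ((pred[1] + pred[3]) / 2 - (gt[1] + gt[3]) / 2) ** 2
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2

    # Aspect-ratio term: v and its trade-off weight alpha.
    v = (4 / math.pi ** 2) * (math.atan(w_gt / h_gt) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v + eps)

    return 1 - iou + rho2 / c2 + alpha * v


# Two 2x2 squares offset by (1, 1): IoU = 1/7, rho^2/c^2 = 2/18, v = 0.
print(ciou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
# Same area but swapped aspect ratio, so v > 0 adds an extra penalty.
print(ciou_loss((0.0, 0.0, 4.0, 2.0), (0.0, 0.0, 2.0, 4.0)))
```

In the first call both boxes are square, so $v = 0$ and the result reduces to the DIoU loss $1 - IoU + \rho^2 / c^2$; in the second the aspect-ratio term is what pushes the loss higher.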
2) Improvements achieved by DIoU and CIoU
(1) Smooth L1 Loss:
$$L_{smoothL1}(x,y)=\begin{cases} 0.5(x-y)^2, & \text{if } |x-y|<1 \\ |x-y|-0.5, & \text{otherwise} \end{cases}$$
(2) IoU Loss: $L_{IoU} = -\ln(IoU)$
(3) GIoU Loss: $L_{GIoU} = 1 - IoU + \frac{|C \setminus (A \cup B)|}{|C|}$
(4) DIoU Loss: $L_{DIoU} = 1 - IoU + \frac{\rho^2(b, b^{gt})}{c^2}$
(5) CIoU Loss: $L_{CIoU} = 1 - IoU + \frac{\rho^2(b, b^{gt})}{c^2} + \alpha v$
import math

import torch


def bbox_iou(box1, box2, x1y1x2y2=True, GIoU=False, DIoU=False, CIoU=False):
    # Returns the IoU (or GIoU / DIoU / CIoU) of box1 to box2. box1 is 4, box2 is nx4
    box2 = box2.t()

    # Get the coordinates of bounding boxes
    if x1y1x2y2:  # x1, y1, x2, y2 = box1
        b1_x1, b1_y1, b1_x2, b1_y2 = box1[0], box1[1], box1[2], box1[3]
        b2_x1, b2_y1, b2_x2, b2_y2 = box2[0], box2[1], box2[2], box2[3]
    else:  # transform from xywh (center, width, height) to xyxy
        b1_x1, b1_x2 = box1[0] - box1[2] / 2, box1[0] + box1[2] / 2
        b1_y1, b1_y2 = box1[1] - box1[3] / 2, box1[1] + box1[3] / 2
        b2_x1, b2_x2 = box2[0] - box2[2] / 2, box2[0] + box2[2] / 2
        b2_y1, b2_y2 = box2[1] - box2[3] / 2, box2[1] + box2[3] / 2

    # Intersection area (clamped to 0 when the boxes do not overlap)
    inter = (torch.min(b1_x2, b2_x2) - torch.max(b1_x1, b2_x1)).clamp(0) * \
            (torch.min(b1_y2, b2_y2) - torch.max(b1_y1, b2_y1)).clamp(0)

    # Union Area
    w1, h1 = b1_x2 - b1_x1, b1_y2 - b1_y1
    w2, h2 = b2_x2 - b2_x1, b2_y2 - b2_y1
    union = (w1 * h1 + 1e-16) + w2 * h2 - inter

    iou = inter / union  # iou
    if GIoU or DIoU or CIoU:
        cw = torch.max(b1_x2, b2_x2) - torch.min(b1_x1, b2_x1)  # convex (smallest enclosing box) width
        ch = torch.max(b1_y2, b2_y2) - torch.min(b1_y1, b2_y1)  # convex height
        if GIoU:  # Generalized IoU https://arxiv.org/pdf/1902.09630.pdf
            c_area = cw * ch + 1e-16  # convex area
            return iou - (c_area - union) / c_area  # GIoU
        if DIoU or CIoU:  # Distance or Complete IoU https://arxiv.org/abs/1911.08287v1
            # convex diagonal squared
            c2 = cw ** 2 + ch ** 2 + 1e-16
            # centerpoint distance squared
            rho2 = ((b2_x1 + b2_x2) - (b1_x1 + b1_x2)) ** 2 / 4 + ((b2_y1 + b2_y2) - (b1_y1 + b1_y2)) ** 2 / 4
            if DIoU:
                return iou - rho2 / c2  # DIoU
            elif CIoU:  # https://github.com/Zzh-tju/DIoU-SSD-pytorch/blob/master/utils/box/box_utils.py#L47
                v = (4 / math.pi ** 2) * torch.pow(torch.atan(w2 / h2) - torch.atan(w1 / h1), 2)
                with torch.no_grad():  # alpha is treated as a constant weight, not differentiated
                    alpha = v / (1 - iou + v + 1e-16)  # small constant avoids 0 / 0 for identical boxes
                return iou - (rho2 / c2 + v * alpha)  # CIoU

    return iou