YOLOv5 adaptive padding

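YOLO-series models mostly take square inputs such as 416×416 or 608×608. Preprocessing usually pads the image to a square first and then resizes it to the target size, but this leaves a large part of the image as an invalid (padded) region. YOLOv5 introduces an adaptive padding method that, during preprocessing, adds only the smallest possible border to the original image.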

Start with the source code:

import cv2
import numpy as np


def letterbox(im, new_shape=(640, 640), color=(114, 114, 114), auto=True, scaleFill=False, scaleup=True, stride=32):
    # Resize and pad image while meeting stride-multiple constraints
    shape = im.shape[:2]  # current shape [height, width]
    if isinstance(new_shape, int):
        new_shape = (new_shape, new_shape)

    # Scale ratio (new / old)
    r = min(new_shape[0] / shape[0], new_shape[1] / shape[1])
    if not scaleup:  # only scale down, do not scale up (for better val mAP)
        r = min(r, 1.0)

    # Compute padding
    ratio = r, r  # width, height ratios
    new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r))
    dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1]  # wh padding
    if auto:  # minimum rectangle
        dw, dh = np.mod(dw, stride), np.mod(dh, stride)  # wh padding
    elif scaleFill:  # stretch
        dw, dh = 0.0, 0.0
        new_unpad = (new_shape[1], new_shape[0])
        ratio = new_shape[1] / shape[1], new_shape[0] / shape[0]  # width, height ratios

    dw /= 2  # divide padding into 2 sides
    dh /= 2

    if shape[::-1] != new_unpad:  # resize
        im = cv2.resize(im, new_unpad, interpolation=cv2.INTER_LINEAR)
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    im = cv2.copyMakeBorder(im, top, bottom, left, right, cv2.BORDER_CONSTANT, value=color)  # add border
    return im, ratio, (dw, dh)
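
The function returns the scale ratio and the per-side padding `(dw, dh)` along with the image. A typical use for these values is to map coordinates predicted on the padded image back to the original one. The helper below is a minimal sketch of that idea; the name `unletterbox_box` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration, not part of the source above.

```python
def unletterbox_box(box, ratio, pad):
    """Map a box from letterboxed-image coordinates back to the original image.

    box   : (x1, y1, x2, y2) in pixels on the padded image (assumed layout)
    ratio : (width_ratio, height_ratio) as returned by letterbox
    pad   : (dw, dh), the per-side padding as returned by letterbox
    """
    x1, y1, x2, y2 = box
    rw, rh = ratio
    dw, dh = pad
    # Remove the left/top padding first, then undo the scaling.
    return ((x1 - dw) / rw, (y1 - dh) / rh, (x2 - dw) / rw, (y2 - dh) / rh)


# A 720x1280 frame letterboxed with r = 0.5 and 12 px of padding on top.
original_box = unletterbox_box((100, 112, 300, 212), (0.5, 0.5), (0.0, 12.0))
```

Because the actual border widths are rounded to integers inside `letterbox`, the result can be off by a fraction of a pixel when the total padding is odd.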

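The procedure is:

  1. Compute the scale ratios for height and width separately and take the smaller one;
  2. Scale the image isotropically (the same factor on both axes) by that ratio;
  3. Pad to the desired size. With `auto=True`, only the remainder of the padding modulo `stride` is added, so the output is the smallest stride-aligned rectangle around the resized image rather than the full `new_shape`.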
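The three steps can be traced with plain arithmetic. The sketch below mirrors the size and padding calculations of `letterbox` without touching any pixels, so it needs neither OpenCV nor NumPy; the function name `letterbox_geometry` is made up for this illustration.

```python
def letterbox_geometry(shape, new_shape=(640, 640), auto=True, scaleup=True, stride=32):
    """Return (resized (w, h), (top, bottom, left, right), output (h, w)) for an image of shape (h, w)."""
    h, w = shape
    # Step 1: take the smaller of the height and width scale ratios.
    r = min(new_shape[0] / h, new_shape[1] / w)
    if not scaleup:
        r = min(r, 1.0)
    # Step 2: isotropic resize target.
    new_w, new_h = int(round(w * r)), int(round(h * r))
    # Step 3: total padding needed to reach new_shape ...
    dw, dh = new_shape[1] - new_w, new_shape[0] - new_h
    if auto:
        # ... reduced to the remainder modulo stride (minimum rectangle).
        dw, dh = dw % stride, dh % stride
    dw, dh = dw / 2, dh / 2  # split between the two sides
    top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1))
    left, right = int(round(dw - 0.1)), int(round(dw + 0.1))
    return (new_w, new_h), (top, bottom, left, right), (new_h + top + bottom, new_w + left + right)


# 720x1280 frame: r = 0.5, resized to 640x360, padded by 12 px top and bottom.
print(letterbox_geometry((720, 1280)))              # output size 384x640
print(letterbox_geometry((720, 1280), auto=False))  # output size 640x640
```

With `auto=True` the 720x1280 frame ends up 384 pixels high instead of 640, because only 24 of the 280 missing rows are padded.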

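In essence, the image is resized first and padded afterwards, which keeps the padded area small. This preprocessing is reported to make inference about 37% faster.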
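To get a feel for how much padding is saved, the sketch below compares a full square canvas with the stride-aligned minimum rectangle for a 720x1280 frame scaled to 360x640. It only counts pixels; it does not measure inference time and says nothing about the 37% figure. The helper name `padded_fraction` is hypothetical.

```python
def padded_fraction(image_hw, canvas_hw):
    """Fraction of the canvas that is padding rather than image pixels."""
    image_area = image_hw[0] * image_hw[1]
    canvas_area = canvas_hw[0] * canvas_hw[1]
    return (canvas_area - image_area) / canvas_area


stride = 32
resized = (360, 640)                      # 720x1280 frame scaled by 0.5
square = (640, 640)                       # padding up to the full target size
min_h = resized[0] + (square[0] - resized[0]) % stride
minimal = (min_h, 640)                    # stride-aligned minimum rectangle

square_frac = padded_fraction(resized, square)
minimal_frac = padded_fraction(resized, minimal)
pixel_ratio = (minimal[0] * minimal[1]) / (square[0] * square[1])

print(f"square canvas : {square_frac:.2%} padding")
print(f"minimal canvas: {minimal_frac:.2%} padding")
print(f"pixels fed to the network: {pixel_ratio:.0%} of the square case")
```

For this frame the padded share drops from 43.75% to 6.25%, and the network sees 60% as many pixels as with the square canvas.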

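Of course, you can also ignore the distortion and resize directly to the desired size (the `scaleFill` branch in the code above, which only takes effect when `auto=False`).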
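When the image is stretched this way, the width and height ratios are no longer equal, which is what causes the distortion. A small sketch of that calculation follows, assuming the same `(height, width)` shape convention as the source; `stretch_ratios` is an illustrative name only.

```python
def stretch_ratios(shape, new_shape=(640, 640)):
    """Return (width_ratio, height_ratio) when stretching shape (h, w) to new_shape (h, w)."""
    h, w = shape
    return new_shape[1] / w, new_shape[0] / h


rw, rh = stretch_ratios((720, 1280))
# A 100x100 px square in the original becomes (100 * rw) x (100 * rh) px.
print(f"width ratio {rw:.3f}, height ratio {rh:.3f}, aspect distortion {rh / rw:.3f}x")
```

For a 720x1280 frame the width is halved while the height shrinks by only about 11%, so objects appear roughly 1.78 times taller relative to their width.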
