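Both Yolo-V4 and Yolo-V5 rely on an important trick: Mosaic data augmentation. In short, it stitches 4 images into one through random scaling, random cropping, and random arrangement. Mosaic has the following advantages: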
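(1) A richer dataset: 4 images are picked at random, randomly scaled, and randomly arranged before being stitched together, which greatly enriches the detection dataset. Random scaling in particular adds many small objects, which makes the network more robust.
(2) Lower GPU memory requirements: each training sample already carries the data of 4 images, so the mini-batch size does not need to be large to achieve good results.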
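Idea: randomly pick four images and paste part of each onto a shared canvas, as shown in the figure below. The four colors represent the four sample images, and any part that extends beyond the canvas is discarded.
The procedure is as follows: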
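step1: Create the mosaic canvas and randomly generate a point on it
import random
import numpy as np

im_size = 640
mosaic_border = [-im_size // 2, -im_size // 2]
s_mosaic = im_size * 2  # the canvas is twice the image size per side
mosaic = np.full((s_mosaic, s_mosaic, 3), 114, dtype=np.uint8)  # gray canvas
# mosaic center, sampled between im_size // 2 and s_mosaic - im_size // 2
y_c, x_c = (int(random.uniform(-x, s_mosaic + x)) for x in mosaic_border)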
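With `im_size = 640`, each entry of `mosaic_border` is -320, so the call reduces to `random.uniform(320, 960)`: the center always lands in the middle half of the 1280-pixel canvas. The following is a minimal sketch that checks this range empirically; the helper name `sample_center` is illustrative and does not come from the original code.

```python
import random


def sample_center(im_size=640, rng=random):
    """Sample the mosaic center (y_c, x_c) the same way as in step1."""
    s_mosaic = im_size * 2
    mosaic_border = [-im_size // 2, -im_size // 2]
    y_c, x_c = (int(rng.uniform(-x, s_mosaic + x)) for x in mosaic_border)
    return y_c, x_c


random.seed(0)
centers = [sample_center() for _ in range(10_000)]
ys = [c[0] for c in centers]
xs = [c[1] for c in centers]
# Both coordinates should stay between 320 and 960 for im_size = 640.
print(min(ys), max(ys), min(xs), max(xs))
```

Keeping the center away from the canvas edges means every one of the four tiles gets a non-empty region.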
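step2: Place the 4 tiles around the random point (x_c, y_c)
In what follows, w and h are the width and height of the image being placed.
(1) Top-left position
Canvas region: (x1a, y1a, x2a, y2a)
case1: the image stays inside the canvas, so the canvas region is (x_c - w, y_c - h, x_c, y_c)
case2: the image extends beyond the canvas, so the canvas region is (0, 0, x_c, y_c)
Combining case1 and case2, the canvas region is:
x1a, y1a, x2a, y2a = max(x_c - w, 0), max(y_c - h, 0), x_c, y_c
Image region: (x1b, y1b, x2b, y2b)
case1: the image stays inside the canvas and needs no cropping, so the image region is (0, 0, w, h)
case2: the image extends beyond the canvas and the overflowing part must be cropped, so the image region is (w - x_c, h - y_c, w, h)
Combining case1 and case2, the image region is:
x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h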
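(2) Top-right position
Canvas region: (x1a, y1a, x2a, y2a)
case1: the image stays inside the canvas, so the canvas region is (x_c, y_c - h, x_c + w, y_c)
case2: the image extends beyond the canvas, so the canvas region is (x_c, 0, s_mosaic, y_c)
Combining case1 and case2, the canvas region is:
x1a, y1a, x2a, y2a = x_c, max(y_c - h, 0), min(x_c + w, s_mosaic), y_c
Image region: (x1b, y1b, x2b, y2b)
case1: the image stays inside the canvas and needs no cropping, so the image region is (0, 0, w, h)
case2: the image extends beyond the canvas and must be cropped, so the image region is (0, h - (y2a - y1a), x2a - x1a, h)
Combining case1 and case2, the image region is:
x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h
The bottom-left and bottom-right tiles are handled in the same way.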
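The four cases can be collected into one small function, which makes it easy to check that the canvas region and the image region always have the same size. This is only a sketch: the name `mosaic_regions` and the example numbers are assumptions for illustration, while the bottom-left and bottom-right formulas mirror those used in the complete listing further down.

```python
def mosaic_regions(i, x_c, y_c, w, h, s_mosaic):
    """Return (canvas_box, image_box) for tile i.

    i: 0 = top left, 1 = top right, 2 = bottom left, 3 = bottom right.
    Each box is (x1, y1, x2, y2) in pixels.
    """
    if i == 0:  # top left
        x1a, y1a, x2a, y2a = max(x_c - w, 0), max(y_c - h, 0), x_c, y_c
        x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h
    elif i == 1:  # top right
        x1a, y1a, x2a, y2a = x_c, max(y_c - h, 0), min(x_c + w, s_mosaic), y_c
        x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h
    elif i == 2:  # bottom left
        x1a, y1a, x2a, y2a = max(x_c - w, 0), y_c, x_c, min(s_mosaic, y_c + h)
        x1b, y1b, x2b, y2b = w - (x2a - x1a), 0, w, min(y2a - y1a, h)
    elif i == 3:  # bottom right
        x1a, y1a, x2a, y2a = x_c, y_c, min(x_c + w, s_mosaic), min(s_mosaic, y_c + h)
        x1b, y1b, x2b, y2b = 0, 0, min(w, x2a - x1a), min(y2a - y1a, h)
    else:
        raise ValueError("i must be 0, 1, 2 or 3")
    return (x1a, y1a, x2a, y2a), (x1b, y1b, x2b, y2b)


# Assumed example: a 640x480 image, center (500, 400), 1280x1280 canvas.
for i in range(4):
    canvas, crop = mosaic_regions(i, x_c=500, y_c=400, w=640, h=480, s_mosaic=1280)
    # The pasted area and the cropped area must have identical width and height.
    assert canvas[2] - canvas[0] == crop[2] - crop[0]
    assert canvas[3] - canvas[1] == crop[3] - crop[1]
    print(i, canvas, crop)
```

In this example the top-left tile is cropped on both axes (the image is larger than the space left of and above the center), whereas the bottom-right tile fits without cropping.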
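step3: Update the bbox coordinates
The bboxes of the 4 images form an (n, 4) array, where n is the number of bboxes in the 4 images and 4 stands for the four coordinate values (xmin, ymin, xmax, ymax). Adding the offset gives the mosaic bbox coordinates:
def xywhn2xyxy(x, padw=0, padh=0):
    # x: (n, 4) bbox coordinates (xmin, ymin, xmax, ymax) in source-image pixels.
    # The name is carried over from YOLOv5; the boxes here are already in
    # xyxy pixel format, so the function only shifts them by (padw, padh).
    x = x if isinstance(x, torch.Tensor) else np.stack(x)
    y = x.clone() if isinstance(x, torch.Tensor) else np.copy(x)
    y[:, 0] = x[:, 0] + padw  # top left x
    y[:, 1] = x[:, 1] + padh  # top left y
    y[:, 2] = x[:, 2] + padw  # bottom right x
    y[:, 3] = x[:, 3] + padh  # bottom right y
    return y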
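The offset is the difference between where a tile starts on the canvas and where its crop starts in the source image; the complete listing computes it as `padw = x1a - x1b` and `padh = y1a - y1b`. As a small worked example with assumed numbers (a top-left tile with w = 640, h = 480 and center (500, 400)), the sketch below shifts one box. The helper `shift_boxes` is a hypothetical NumPy-only stand-in, not the function from the article.

```python
import numpy as np


def shift_boxes(boxes, padw=0, padh=0):
    """Shift (n, 4) xyxy pixel boxes by (padw, padh); hypothetical helper."""
    out = np.array(boxes, dtype=np.float32).reshape(-1, 4)
    out[:, [0, 2]] += padw  # xmin, xmax
    out[:, [1, 3]] += padh  # ymin, ymax
    return out


# Assumed regions for the top-left tile (w=640, h=480, center (500, 400)).
x1a, y1a = 0, 0      # where the tile starts on the canvas
x1b, y1b = 140, 80   # where the crop starts in the source image
padw, padh = x1a - x1b, y1a - y1b  # (-140, -80)

boxes = [[200, 100, 300, 220]]  # one box in source-image pixels
shifted = shift_boxes(boxes, padw, padh)
print(shifted)  # the box moves left by 140 and up by 80
```

A negative offset simply means the image was cropped on its left or top side before being pasted.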
The complete Python implementation of mosaic is as follows (xywhn2xyxy is the function defined in step3):
import os.path
import random

import cv2
import numpy as np
import torch  # needed by xywhn2xyxy from step3
from camvid import get_bbox, draw_box  # camvid: helper module not shown in this article


def load_mosaic(im_files, names, name_color_dict):
    im_size = 640
    s_mosaic = im_size * 2
    mosaic_border = [-im_size // 2, -im_size // 2]
    labels4, colors = [], []
    # mosaic center x, y
    y_c, x_c = (int(random.uniform(-x, s_mosaic + x)) for x in mosaic_border)
    img4 = np.full((s_mosaic, s_mosaic, 3), 114, dtype=np.uint8)
    seg4 = np.full((s_mosaic, s_mosaic), 0, dtype=np.uint8)
    for i, im_file in enumerate(im_files):
        # Load the image and its segmentation label (labels/<name>_L.png)
        img = cv2.imread(im_file)
        seg_file = im_file.replace('images', 'labels')
        name = os.path.basename(seg_file).split('.')[0]
        seg_file = os.path.join(os.path.dirname(seg_file), name + '_L.png')
        seg, boxes, color = get_bbox(seg_file, names, name_color_dict)
        colors += color
        h, w = img.shape[:2]
        # place img in img4
        if i == 0:  # top left
            x1a, y1a, x2a, y2a = max(x_c - w, 0), max(y_c - h, 0), x_c, y_c
            x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h
        elif i == 1:  # top right
            x1a, y1a, x2a, y2a = x_c, max(y_c - h, 0), min(x_c + w, s_mosaic), y_c
            x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h
        elif i == 2:  # bottom left
            x1a, y1a, x2a, y2a = max(x_c - w, 0), y_c, x_c, min(s_mosaic, y_c + h)
            x1b, y1b, x2b, y2b = w - (x2a - x1a), 0, w, min(y2a - y1a, h)
        elif i == 3:  # bottom right
            x1a, y1a, x2a, y2a = x_c, y_c, min(x_c + w, s_mosaic), min(s_mosaic, y_c + h)
            x1b, y1b, x2b, y2b = 0, 0, min(w, x2a - x1a), min(y2a - y1a, h)
        img4[y1a:y2a, x1a:x2a] = img[y1b:y2b, x1b:x2b]
        # place seg in seg4
        seg4[y1a:y2a, x1a:x2a] = seg[y1b:y2b, x1b:x2b]
        # update bbox: offset between the canvas region and the image region
        padw = x1a - x1b
        padh = y1a - y1b
        boxes = xywhn2xyxy(boxes, padw=padw, padh=padh)
        labels4.append(boxes)
    labels4 = np.concatenate(labels4, 0)
    # clip all four coordinates (xmin, ymin, xmax, ymax) to the canvas
    np.clip(labels4, 0, s_mosaic, out=labels4)
    # draw result
    draw_box(seg4, labels4, colors)
    return img4, labels4, seg4


if __name__ == '__main__':
    names = ['Pedestrian', 'Car', 'Truck_Bus']
    im_files = ['camvid/images/0016E5_01440.png',
                'camvid/images/0016E5_06600.png',
                'camvid/images/0006R0_f00930.png',
                'camvid/images/0006R0_f03390.png']
    # name_color_dict (passed on to get_bbox) is not defined in this listing
    # and must be created before this call.
    load_mosaic(im_files, names, name_color_dict)
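The listing above depends on the `camvid` helper module and on image files that are not included here. As a rough, self-contained sketch of the same placement and offset logic, the following uses four solid-color arrays in place of real samples, echoing the four-color illustration of the idea. The function `synthetic_mosaic`, its arguments, and the test boxes are illustrative assumptions, not part of the original code or of YOLOv5.

```python
import random

import numpy as np


def synthetic_mosaic(images, boxes_per_image, im_size=640, seed=None):
    """Stitch four HxWx3 uint8 images and shift their xyxy pixel boxes."""
    rng = random.Random(seed)
    s = im_size * 2
    border = -im_size // 2
    # Same sampling range as in step1.
    y_c, x_c = (int(rng.uniform(-border, s + border)) for _ in range(2))
    canvas = np.full((s, s, 3), 114, dtype=np.uint8)
    shifted = []
    for i, (img, boxes) in enumerate(zip(images, boxes_per_image)):
        h, w = img.shape[:2]
        if i == 0:  # top left
            x1a, y1a, x2a, y2a = max(x_c - w, 0), max(y_c - h, 0), x_c, y_c
            x1b, y1b, x2b, y2b = w - (x2a - x1a), h - (y2a - y1a), w, h
        elif i == 1:  # top right
            x1a, y1a, x2a, y2a = x_c, max(y_c - h, 0), min(x_c + w, s), y_c
            x1b, y1b, x2b, y2b = 0, h - (y2a - y1a), min(w, x2a - x1a), h
        elif i == 2:  # bottom left
            x1a, y1a, x2a, y2a = max(x_c - w, 0), y_c, x_c, min(s, y_c + h)
            x1b, y1b, x2b, y2b = w - (x2a - x1a), 0, w, min(y2a - y1a, h)
        else:  # bottom right
            x1a, y1a, x2a, y2a = x_c, y_c, min(x_c + w, s), min(s, y_c + h)
            x1b, y1b, x2b, y2b = 0, 0, min(w, x2a - x1a), min(y2a - y1a, h)
        canvas[y1a:y2a, x1a:x2a] = img[y1b:y2b, x1b:x2b]
        # Offset (padw, padh, padw, padh) applied to (xmin, ymin, xmax, ymax).
        offset = np.array([x1a - x1b, y1a - y1b] * 2, dtype=np.float32)
        shifted.append(np.asarray(boxes, dtype=np.float32).reshape(-1, 4) + offset)
    out = np.concatenate(shifted, 0)
    np.clip(out, 0, s, out=out)  # keep boxes inside the canvas
    return canvas, out, (x_c, y_c)


# Four solid-color 640x480 "images", each with one made-up box.
colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 0)]
images = [np.full((480, 640, 3), c, dtype=np.uint8) for c in colors]
boxes = [[[100, 100, 200, 200]] for _ in images]
canvas, mosaic_boxes, (x_c, y_c) = synthetic_mosaic(images, boxes, seed=0)
print(canvas.shape, mosaic_boxes.shape, (x_c, y_c))
```

The four pixels surrounding the center each belong to a different tile, which gives a quick sanity check that the tiles meet exactly at (x_c, y_c).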
YOLOV5: https://github.com/ultralytics/yolov5