Data augmentation increases the diversity of the training samples and reduces the model's reliance on particular attributes, thereby improving generalization and helping to avoid overfitting (it can be viewed as a form of implicit regularization).

Random cropping and scaling make objects appear at different scales and in different positions within the image, which likewise reduces the model's sensitivity to object location. Jittering factors such as brightness, contrast, saturation, and hue reduces the model's sensitivity to color.

For test-time augmentation, a commonly used scheme is to take five crops (top-left, top-right, bottom-right, bottom-left, and center) and average the predictions over the five crops. Spatial geometric transforms do not change the content of the image itself; they either select a part of the image or spatially redistribute its pixels, as in the sketch below.
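A minimal sketch of this kind of spatial augmentation (random crop plus random horizontal flip), using only NumPy; the function name, crop size, and flip probability are illustrative choices, not something prescribed above:

import numpy as np

def random_crop_flip(img, crop_h=224, crop_w=224):
    # img: H x W x C array whose sides are at least crop_h x crop_w
    h, w = img.shape[:2]
    y = np.random.randint(0, h - crop_h + 1)   # random top-left corner
    x = np.random.randint(0, w - crop_w + 1)
    patch = img[y:y + crop_h, x:x + crop_w]
    if np.random.rand() < 0.5:                 # mirror horizontally with probability 0.5
        patch = patch[:, ::-1]
    return patch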
Training-time cropping:

AlexNet: resize the short side to 256 (the long side is resized with the same ratio so the aspect ratio is preserved), center crop (or top-center crop, etc.) to 256×256, then take online random 224×224 crops during training (32×32 possible crop positions). VGG and ResNet use scale jittering instead; a sketch is given after the offline preprocessing function below, and the implementation in slim can also be consulted.
import codecs
import os

import cv2
import numpy as np


def preprocess(img_dir, save_dir, save_phase):
    CROP_SIZE = 128.0
    cnt = 0
    wrong_cnt = 0
    f = codecs.open(save_phase, "w", encoding="utf-8")  # save the (image_path, label) list
    for img_name in os.listdir(img_dir):
        try:
            img_path = os.path.join(img_dir, img_name)
            save_path = os.path.join(save_dir, img_name)
            img_label = img_name.split('_')[0]  # the label is taken from the file-name prefix
            img_bgr = cv2.imdecode(np.fromfile(img_path, dtype=np.uint8), 1)
            # resize the image proportionally so that the short side equals CROP_SIZE
            h, w = img_bgr.shape[:2]
            if h > w:
                resize_ratio = CROP_SIZE / w
                h_resized = int(np.round(resize_ratio * h))
                w_resized = int(CROP_SIZE)
            else:
                resize_ratio = CROP_SIZE / h
                h_resized = int(CROP_SIZE)
                w_resized = int(np.round(resize_ratio * w))
            img_bgr_resized = cv2.resize(img_bgr, (w_resized, h_resized),
                                         interpolation=cv2.INTER_AREA)  # INTER_AREA works well for shrinking
            # top center crop
            if h_resized < w_resized:
                off = (w_resized - h_resized) // 2
                img_bgr_cropped = img_bgr_resized[:, off:off + h_resized]
            else:
                img_bgr_cropped = img_bgr_resized[:w_resized, :]
            """
            # center crop
            if h_resized < w_resized:
                off = (w_resized - h_resized) // 2
                img_bgr_cropped = img_bgr_resized[:, off:off + h_resized]
            else:
                off = (h_resized - w_resized) // 2
                img_bgr_cropped = img_bgr_resized[off:off + w_resized, :]
            """
            cv2.imencode('.jpg', img_bgr_cropped)[1].tofile(save_path)
            f.write("{} {}\n".format(save_path, img_label))
            cnt += 1
            print('Processed {} images'.format(cnt))
        except Exception as e:
            wrong_cnt += 1
            print('Wrong img name is {}'.format(img_name))
            print('Error reason is {}'.format(e))
            print('Wrong cnt is {}'.format(wrong_cnt))
            continue
    f.close()
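The scale jittering used by VGG and ResNet can be sketched as follows. The [256, 512] range for the short side follows the VGG paper; the helper name itself is illustrative:

import cv2
import numpy as np

def scale_jitter_random_crop(img_bgr, s_min=256, s_max=512, crop=224):
    # resize the short side to a random scale S in [s_min, s_max], keeping the aspect
    # ratio, then take a random crop x crop patch for training
    s = np.random.randint(s_min, s_max + 1)
    h, w = img_bgr.shape[:2]
    if h < w:
        new_h, new_w = s, int(round(w * s / h))
    else:
        new_h, new_w = int(round(h * s / w)), s
    resized = cv2.resize(img_bgr, (new_w, new_h), interpolation=cv2.INTER_AREA)
    y = np.random.randint(0, new_h - crop + 1)
    x = np.random.randint(0, new_w - crop + 1)
    return resized[y:y + crop, x:x + crop]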
Evaluation-time cropping (see: single crop / multiple crops):

Single crop: resize the short side to 256 (the long side is resized with the same ratio), center crop (or top-center crop, etc.) to 256×256, then center crop to 224×224 for evaluation. Alternatively, resize the short side directly to 224 and then center crop (or top-center crop, etc.) to 224×224 for evaluation.

Multiple crops: resize the short side to 256 (the long side is resized with the same ratio), center crop (or top-center crop, etc.) to 256×256, then take the four corner crops, the center crop, and their mirrored versions (10 crops in total) and average their predictions, as in the oversample function below.

# Reference: caffe/python/caffe/io.py
def oversample(images, crop_dims):
    """
    Crop images into the four corners, center, and their mirrored versions.

    Parameters
    ----------
    images : iterable of (H x W x K) ndarrays
    crop_dims : (height, width) tuple for the crops.

    Returns
    -------
    crops : (10*N x H x W x K) ndarray of crops for number of inputs N.
    """
    # Dimensions and center.
    im_shape = np.array(images[0].shape)
    crop_dims = np.array(crop_dims)
    im_center = im_shape[:2] / 2.0

    # Make crop coordinates
    h_indices = (0, im_shape[0] - crop_dims[0])
    w_indices = (0, im_shape[1] - crop_dims[1])
    crops_ix = np.empty((5, 4), dtype=int)
    curr = 0
    for i in h_indices:
        for j in w_indices:
            crops_ix[curr] = (i, j, i + crop_dims[0], j + crop_dims[1])
            curr += 1
    crops_ix[4] = np.tile(im_center, (1, 2)) + np.concatenate([
        -crop_dims / 2.0,
         crop_dims / 2.0
    ])
    crops_ix = np.tile(crops_ix, (2, 1))

    # Extract crops
    crops = np.empty((10 * len(images), crop_dims[0], crop_dims[1],
                      im_shape[-1]), dtype=np.float32)
    ix = 0
    for im in images:
        for crop in crops_ix:
            crops[ix] = im[crop[0]:crop[2], crop[1]:crop[3], :]
            ix += 1
        crops[ix-5:ix] = crops[ix-5:ix, :, ::-1, :]  # flip for mirrors
    return crops
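A sketch of how oversample can be used for the multiple-crops evaluation described above; model is a hypothetical callable that maps a batch of H x W x C crops to class probabilities:

# img is assumed to be the 256x256 (top-)center-cropped image from the steps above
crops = oversample([img], crop_dims=(224, 224))   # shape: (10, 224, 224, 3)
probs = model(crops)                              # hypothetical model, shape: (10, num_classes)
avg_prob = probs.mean(axis=0)                     # average the 10 predictions
pred_label = int(np.argmax(avg_prob))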
Color augmentation (changing the image's coordinates in color space rather than the positions of its pixels):

First, convert the image from the RGB color space to the HSV color space. Then, in HSV space, randomly change the image's original saturation and value (i.e., modify the S and V channels), or apply small random adjustments to the hue (H).

Random brightness is implemented by randomly adding the same offset to every pixel:
if random.randint(2): image += random.uniform(-self.delta, self.delta)
where the default value of delta is 32.

Note: contrast can also be changed randomly. Possible orderings: random brightness --> random contrast --> random HSV, or random brightness --> random HSV --> random contrast, as in the snippet below.
# photometric distortion, wrapped into a class here for completeness; the transforms follow
# the SSD data augmentation pipeline (reference 1 below), and random refers to numpy.random
class PhotometricDistort(object):
    def __init__(self):
        self.pd = [
            RandomContrast(),
            ConvertColor(transform='HSV'),
            RandomSaturation(),
            RandomHue(),
            ConvertColor(current='HSV', transform='BGR'),
            RandomContrast()
        ]
        self.rand_brightness = RandomBrightness()

    def __call__(self, im, boxes, labels):
        im, boxes, labels = self.rand_brightness(im, boxes, labels)
        if random.randint(2):
            distort = Compose(self.pd[:-1])   # brightness -> contrast -> HSV
        else:
            distort = Compose(self.pd[1:])    # brightness -> HSV -> contrast
        im, boxes, labels = distort(im, boxes, labels)
        return im, boxes, labels
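The snippet above relies on several small transform classes that are not shown; minimal sketches are given below, following the ssd.pytorch-style (image, boxes, labels) interface on float32 images. The brightness delta of 32 comes from the text above; the other parameter ranges and defaults are assumptions:

import cv2
from numpy import random   # as in the snippet above, random refers to numpy.random

class Compose(object):
    # applies a list of transforms in order
    def __init__(self, transforms):
        self.transforms = transforms
    def __call__(self, image, boxes=None, labels=None):
        for t in self.transforms:
            image, boxes, labels = t(image, boxes, labels)
        return image, boxes, labels

class ConvertColor(object):
    # converts between BGR and HSV with OpenCV
    def __init__(self, current='BGR', transform='HSV'):
        self.current, self.transform = current, transform
    def __call__(self, image, boxes=None, labels=None):
        if self.current == 'BGR' and self.transform == 'HSV':
            image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
        elif self.current == 'HSV' and self.transform == 'BGR':
            image = cv2.cvtColor(image, cv2.COLOR_HSV2BGR)
        return image, boxes, labels

class RandomBrightness(object):
    def __init__(self, delta=32):
        self.delta = delta
    def __call__(self, image, boxes=None, labels=None):
        if random.randint(2):                                  # apply with probability 0.5
            image += random.uniform(-self.delta, self.delta)   # same offset for every pixel
        return image, boxes, labels

class RandomContrast(object):
    def __init__(self, lower=0.5, upper=1.5):                  # assumed range
        self.lower, self.upper = lower, upper
    def __call__(self, image, boxes=None, labels=None):
        if random.randint(2):
            image *= random.uniform(self.lower, self.upper)    # scale all channels
        return image, boxes, labels

class RandomSaturation(object):
    # expects an HSV image; scales the S channel
    def __init__(self, lower=0.5, upper=1.5):                  # assumed range
        self.lower, self.upper = lower, upper
    def __call__(self, image, boxes=None, labels=None):
        if random.randint(2):
            image[:, :, 1] *= random.uniform(self.lower, self.upper)
        return image, boxes, labels

class RandomHue(object):
    # expects an HSV image; shifts the H channel and wraps it back into [0, 360)
    def __init__(self, delta=18.0):                            # assumed default
        self.delta = delta
    def __call__(self, image, boxes=None, labels=None):
        if random.randint(2):
            image[:, :, 0] += random.uniform(-self.delta, self.delta)
            image[:, :, 0][image[:, :, 0] > 360.0] -= 360.0
            image[:, :, 0][image[:, :, 0] < 0.0] += 360.0
        return image, boxes, labels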
The principle of using HSV to adjust image color: in the HSV representation, hue (H) is the angle on the color wheel, saturation (S) is the distance from the central gray axis, and value (V) is the brightness along that axis, so color changes become simple shifts or scalings of individual coordinates rather than coupled changes across R, G, and B.

Channel permutation:
# suitable for tasks that are not color-sensitive; with probability 0.5, randomly pick one of the
# orders below and permute the color channels -- there are 6 possibilities in total
self.perms = ((0, 1, 2), (0, 2, 1),
              (1, 0, 2), (1, 2, 0),
              (2, 0, 1), (2, 1, 0))
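A sketch of how one of these permutations might be applied to an image (perms stands for the self.perms tuple above, and random is numpy.random as before):

if random.randint(2):                          # with probability 0.5
    swap = perms[random.randint(len(perms))]   # pick one of the 6 channel orders
    image = image[:, :, list(swap)]            # reorder the color channels accordingly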
In addition, AlexNet applies PCA-based color jittering, altering the intensities of the RGB channels in training images. For an off-the-shelf augmentation library, Augmentor can be installed with pip install Augmentor. A sketch of PCA color jittering is given below.
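A minimal sketch of PCA color jittering, assuming a float32 RGB image. For brevity, the principal components here are computed per image, whereas the AlexNet paper computes them once over the RGB values of the whole training set; the 0.1 standard deviation for the random weights follows that paper:

import numpy as np

def pca_color_jitter(img, alpha_std=0.1):
    # img: H x W x 3 float32 RGB image
    flat = img.reshape(-1, 3)
    flat = flat - flat.mean(axis=0)                    # center the RGB values
    cov = np.cov(flat, rowvar=False)                   # 3 x 3 covariance of the channels
    eigvals, eigvecs = np.linalg.eigh(cov)             # principal components of RGB space
    alphas = np.random.normal(0.0, alpha_std, size=3)  # one random weight per component
    shift = eigvecs @ (alphas * eigvals)               # offset added to every pixel
    return img + shift.astype(np.float32)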
References:
1. Data Augmentation in SSD (Single Shot Detector)
2. https://github.com/tensorflow/models/blob/master/research/object_detection/core/preprocessor.py
3. 公输睚信 (Shirhe-Lyh): https://github.com/Shirhe-Lyh/finetune_classification/blob/master/preprocessing.py
4. Application of Synthetic Minority Over-sampling Technique (SMOTe) for imbalanced datasets
5. AutoAugment: Learning Augmentation Strategies from Data (Google Brain)
6. My own earlier summary
7. Hikvision Research Institute: ImageNet 2016 competition experience sharing
8. http://image-net.org/challenges/talks/2016/Hikvision_at_ImageNet_2016.pdf
9. Understanding convolutional neural networks: nine important deep learning papers (Part 1)
10. https://github.com/chuanqi305/MobileNet-SSD
11. A look at data augmentation