He K., Zhang X., Ren S. and Sun J. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.
即:
import torchvision.transforms as T
T.Compose([
T.Pad(4, padding_mode='reflect'),
T.RandomCrop(32), # for CIFAR-10 and CIFAR-100
T.RandomHorizontalFlip(),
T.ToTensor()
])
注: T.RandomCrop(200)
DeVries T. and Taylor G. W. Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552, 2017
原文代码
形象点说, 就是给图片挖几个洞(不过作者的解释挺有趣的, 输入层的连续dropout, 这个想法有趣).
# uoguelph-mlrg cutout
# https://github.com/uoguelph-mlrg/Cutout/blob/master/util/cutout.py
# 原文代码
import torch
import numpy as np
class Cutout(object):
"""Randomly mask out one or more patches from an image.
Args:
n_holes (int): Number of patches to cut out of each image.
length (int): The length (in pixels) of each square patch.
"""
def __init__(self, n_holes, length):
self.n_holes = n_holes
self.length = length
def __call__(self, img):
"""
Args:
img (Tensor): Tensor image of size (C, H, W).
Returns:
Tensor: Image with n_holes of dimension length x length cut out of it.
"""
h = img.size(1)
w = img.size(2)
mask = np.ones((h, w), np.float32)
for n in range(self.n_holes):
y = np.random.randint(h)
x = np.random.randint(w)
y1 = np.clip(y - self.length // 2, 0, h)
y2 = np.clip(y + self.length // 2, 0, h)
x1 = np.clip(x - self.length // 2, 0, w)
x2 = np.clip(x + self.length // 2, 0, w)
mask[y1: y2, x1: x2] = 0.
mask = torch.from_numpy(mask)
mask = mask.expand_as(img)
img = img * mask
return img
Zhang H., Cisse M., Dauphin Y. N. and Lopez-Paz D. mixup: Beyond empirical risk minimization. In International Conference on Learning Representations (ICLR), 2018.
代码
MixUp 就是将两个图片按照一定比例相加
x ~ = λ x A + ( 1 − λ ) x B , \tilde{x} = \lambda x_A + (1 - \lambda) x_B, x~=λxA+(1−λ)xB,
λ ∼ B e t a ( α , α ) , α ∈ ( 0 , + ∞ ) \lambda \sim \mathrm{Beta}(\alpha, \alpha), \alpha \in (0, +\infty) λ∼Beta(α,α),α∈(0,+∞), 同时
y ~ = λ y A + ( 1 − λ ) y B . \tilde{y} = \lambda y_A + (1 - \lambda) y_B. y~=λyA+(1−λ)yB.
Yun S., Han D., Oh S. J., Chun S., Choe J. and Yoo Y. Cutmix: Regularization strategy to train strong classifiers with localizable features. In International Conference on Vision (ICCV), 2019.
原文代码
CutMix 就是将一个图片裁剪掉一块, 然后用另一张图片的对应位置填补:
x ~ = M ∘ x A + ( 1 − M ) ∘ x B , \tilde{x} = M \circ x_A + (1 - M) \circ x_B, x~=M∘xA+(1−M)∘xB,
其中 M ∈ { 0 , 1 } H × W M \in \{0, 1\}^{H \times W} M∈{0,1}H×W 是掩码矩阵, 按照如下方式生成:
r x ∼ U [ 0 , W ] , r y = U [ 0 , H ] , r w = W 1 − λ , r h = H 1 − λ , r_x \sim U[0, W], \: r_y = U[0, H], \\ r_w = W\sqrt{1 - \lambda} , \: r_h = H\sqrt{1 - \lambda}, rx∼U[0,W],ry=U[0,H],rw=W1−λ,rh=H1−λ,
得到一个bounding box, 在其中的 M i j = 0 M_{ij}= 0 Mij=0, 否则为 1 1 1, λ \lambda λ控制框的大小的:
r w r h H W = 1 − λ . \frac{r_w r_h}{HW} = 1 - \lambda. HWrwrh=1−λ.
此时, 标签需要发生相应变化:
y ~ = λ y A + ( 1 − λ ) y B , \tilde{y} = \lambda y_{A} + (1 - \lambda)y_B, y~=λyA+(1−λ)yB,
这里标签是one-hot形式(或者概率向量).
AugMix
Cubuk E. D., Zoph B., Man’{e}, D., Vasudevan V. and Le Q. V. AutoAugment: learning augmentation strategies from data. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
根据数据集选择合理的augmentations策略.
Cubuk E. D., Zoph B., Shlens J. and Le Q. V. RandAugment: practical automated data augmentation with a reduced search space.
从 K K K中augmentations中随机选择 N N N(replace=True), 每个的magnitude都是 M M M.
其中 M , N M, N M,N是通过网格搜索找到的(太粗暴了吧)
Hendrycks D., Basart S., Mu N., Kadavath S., Wang F., Dorundo E., Desai R., Zhu T., Parajuli S., Guo M., Song D., Steinhardt J. Gilmer J. The many faces of robustness: a critical analysis of out-of-distribution generalization. arXiv preprint arXiv:2006.16241, 2020.
假设 f ( ⋅ ; θ ) f(\cdot;\theta) f(⋅;θ)是一个Image-to-Image的网络, 则DeepAugment是指对 θ \theta θ添加扰动, 比如扰动权重, 改变激活函数等等.
和这个思想类似的还有AdversarialAugment.