Visualizing Data Augmentation in mmdetection

1. PhotoMetricDistortion

Function description

class PhotoMetricDistortion:
    """Apply photometric distortion to image sequentially, every transformation
    is applied with a probability of 0.5. The position of random contrast is in
    second or second to last.

    1. random brightness
    2. random contrast (mode 0)
    3. convert color from BGR to HSV
    4. random saturation
    5. random hue
    6. convert color from HSV to BGR
    7. random contrast (mode 1)
    8. randomly swap channels

    Args:
        brightness_delta (int): delta of brightness.
        contrast_range (tuple): range of contrast.
        saturation_range (tuple): range of saturation.
        hue_delta (int): delta of hue.
    """

Result of applying the augmentation:

Before  After
[Image 1]  [Image 2]
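
As a rough illustration of the steps listed in the PhotoMetricDistortion docstring, the sketch below applies only the brightness, contrast, and channel-swap steps with NumPy. The HSV steps (saturation and hue) are left out, and the two possible contrast positions are collapsed into one. The helper name `photometric_sketch` and its default values are assumptions made for illustration; this is not mmdetection's implementation.

```python
import numpy as np


def photometric_sketch(img, rng, brightness_delta=32, contrast_range=(0.5, 1.5)):
    """Simplified photometric distortion (hypothetical helper, HSV steps omitted)."""
    img = img.astype(np.float32)
    # Random brightness: add one offset to every pixel, with probability 0.5.
    if rng.random() < 0.5:
        img = img + float(rng.uniform(-brightness_delta, brightness_delta))
    # Random contrast: multiply every pixel by one factor, with probability 0.5.
    if rng.random() < 0.5:
        img = img * float(rng.uniform(contrast_range[0], contrast_range[1]))
    # Random channel swap: permute the last axis, with probability 0.5.
    if rng.random() < 0.5:
        img = img[..., rng.permutation(3)]
    return img


rng = np.random.default_rng(0)
image = np.full((4, 4, 3), 128, dtype=np.uint8)
distorted = photometric_sketch(image, rng)
print(distorted.shape, distorted.dtype)
```

The sketch returns a float image without clipping, so a caller that wants to display the result would still need to clip to [0, 255] and cast back to uint8.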

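2. RandomShift

Description: shifts the image and its bounding boxes by a random number of pixels up, down, left, or right. The size of the shift is capped by a maximum pixel value (max_shift_px).

Function description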

class RandomShift:
    """Shift the image and box given shift pixels and probability.

    Args:
        shift_ratio (float): Probability of shifts. Default 0.5.
        max_shift_px (int): The max pixels for shifting. Default 32.
        filter_thr_px (int): The width and height threshold for filtering.
            The bbox and the rest of the targets below the width and
            height threshold will be filtered. Default 1.
    """

    def __init__(self, shift_ratio=0.5, max_shift_px=32, filter_thr_px=1):
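
The behaviour described by the docstring can be sketched in NumPy as follows. This is a minimal sketch under stated assumptions: the helper name `shift_image_and_boxes` is hypothetical, boxes are assumed to be in (x1, y1, x2, y2) format, and the shift (dx, dy) is passed in explicitly so the example is deterministic, whereas a real pipeline would draw it at random within plus or minus max_shift_px.

```python
import numpy as np


def shift_image_and_boxes(img, boxes, dx, dy, filter_thr_px=1):
    """Shift an image and its (x1, y1, x2, y2) boxes by (dx, dy) pixels."""
    h, w = img.shape[:2]
    out = np.zeros_like(img)  # uncovered area stays black
    new_w, new_h = w - abs(dx), h - abs(dy)
    # Source and destination offsets of the region that stays visible.
    src_x, dst_x = max(0, -dx), max(0, dx)
    src_y, dst_y = max(0, -dy), max(0, dy)
    out[dst_y:dst_y + new_h, dst_x:dst_x + new_w] = \
        img[src_y:src_y + new_h, src_x:src_x + new_w]

    # Move the boxes by the same offset, then clip them to the image.
    boxes = boxes.astype(np.float32) + np.array([dx, dy, dx, dy], dtype=np.float32)
    boxes[:, 0::2] = np.clip(boxes[:, 0::2], 0, w)
    boxes[:, 1::2] = np.clip(boxes[:, 1::2], 0, h)

    # Drop boxes whose width or height is not above the threshold.
    box_w = boxes[:, 2] - boxes[:, 0]
    box_h = boxes[:, 3] - boxes[:, 1]
    keep = (box_w > filter_thr_px) & (box_h > filter_thr_px)
    return out, boxes[keep]


img = np.arange(36, dtype=np.uint8).reshape(6, 6)
boxes = np.array([[1, 1, 4, 4], [0, 0, 2, 2]])
shifted, kept = shift_image_and_boxes(img, boxes, dx=2, dy=-1)
print(kept)
```

In this example the second box is pushed partly off the top edge, ends up only one pixel high after clipping, and is filtered out.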

Result of applying the augmentation:

Before  After
[Image 3]  [Image 4]

3. RandomAffine

Description: an affine transform can be built by composing a series of atomic transforms: translation, scaling, rotation, and shear.
Translation is a rigid-body transformation, meaning it moves the object without deforming it.

Translation  Scaling  Shear  Rotation
[Image 5]  [Image]  [Image 6]  [Image 7]

Function description

class RandomAffine:
    """Random affine transform data augmentation.

    This operation randomly generates affine transform matrix which including
    rotation, translation, shear and scaling transforms.

    Args:
        max_rotate_degree (float): Maximum degrees of rotation transform.
            Default: 10.
        max_translate_ratio (float): Maximum ratio of translation.
            Default: 0.1.
        scaling_ratio_range (tuple[float]): Min and max ratio of
            scaling transform. Default: (0.5, 1.5).
        max_shear_degree (float): Maximum degrees of shear
            transform. Default: 2.
        border (tuple[int]): Distance from height and width sides of input
            image to adjust output shape. Only used in mosaic dataset.
            Default: (0, 0).
        border_val (tuple[int]): Border padding values of 3 channels.
            Default: (114, 114, 114).
        min_bbox_size (float): Width and height threshold to filter bboxes.
            If the height or width of a box is smaller than this value, it
            will be removed. Default: 2.
        min_area_ratio (float): Threshold of area ratio between
            original bboxes and wrapped bboxes. If smaller than this value,
            the box will be removed. Default: 0.2.
        max_aspect_ratio (float): Aspect ratio of width and height
            threshold to filter bboxes. If max(h/w, w/h) larger than this
            value, the box will be removed.
    """

    def __init__(self,
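                 max_rotate_degree=10.0,  # maximum rotation angle, in degrees
                 max_translate_ratio=0.1,  # maximum translation as a fraction of width/height
                 scaling_ratio_range=(0.5, 1.5),  # range of the scaling ratio
                 max_shear_degree=2.0,  # maximum shear angle, in degrees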
                 border=(0, 0),
                 border_val=(114, 114, 114),
                 min_bbox_size=2,
                 min_area_ratio=0.2,
                 max_aspect_ratio=20):
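
To make the idea of composing atomic transforms concrete, here is a small NumPy sketch that builds each 3x3 homogeneous matrix and multiplies them together. The helper names and the composition order (scale, then rotate, then shear, then translate) are assumptions chosen for illustration and may differ from what the library does internally.

```python
import math

import numpy as np


def rotation(deg):
    r = math.radians(deg)
    return np.array([[math.cos(r), -math.sin(r), 0.0],
                     [math.sin(r), math.cos(r), 0.0],
                     [0.0, 0.0, 1.0]])


def scaling(s):
    return np.array([[s, 0.0, 0.0], [0.0, s, 0.0], [0.0, 0.0, 1.0]])


def shear(x_deg, y_deg):
    return np.array([[1.0, math.tan(math.radians(x_deg)), 0.0],
                     [math.tan(math.radians(y_deg)), 1.0, 0.0],
                     [0.0, 0.0, 1.0]])


def translation(tx, ty):
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])


def warp_points(matrix, pts):
    """Apply a 3x3 affine matrix to an (N, 2) array of points."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coordinates
    return (pts_h @ matrix.T)[:, :2]


# The rightmost matrix is applied first: scale, rotate, shear, translate.
warp = translation(10, 5) @ shear(0, 0) @ rotation(90) @ scaling(2.0)
points = np.array([[1.0, 0.0], [0.0, 1.0]])
warped = warp_points(warp, points)
print(np.round(warped, 3))
```

Bounding boxes can be handled the same way by warping their four corners and taking the axis-aligned box around the result, after which filters such as min_bbox_size, min_area_ratio, and max_aspect_ratio decide which boxes survive.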

Result of applying the augmentation:

Before  After
[Image 8]  [Image 9]

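5. Corruptions

Description: the imagecorruptions module provides many ways to add noise to an image. Choose the method through "corruption_name" (for example gaussian_noise) and the strength of the effect through "severity" (an integer from 1 to 5).

Function description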

class Corrupt:
    """Corruption augmentation.
    Corruption transforms implemented based on
    `imagecorruptions`_.
    Args:
        corruption (str): Corruption name, for example one of
            gaussian_noise, shot_noise, impulse_noise,
            defocus_blur, glass_blur, motion_blur, zoom_blur,
            snow, frost, fog, brightness, contrast,
            elastic_transform, pixelate, jpeg_compression.
        severity (int, optional): The severity of corruption. Default: 1.
    """

    def __init__(self, corruption, severity=1):
        self.corruption = corruption
        self.severity = severity
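
To show what a severity level means in practice, the sketch below adds Gaussian noise with a standard deviation that grows with severity. It is a stand-in written in plain NumPy, not a call into the imagecorruptions package: the helper name `gaussian_noise_sketch` and the values in `SIGMA_BY_SEVERITY` are illustrative assumptions.

```python
import numpy as np

# Illustrative noise scales for severity 1..5 (assumed values, on a 0..1 pixel range).
SIGMA_BY_SEVERITY = (0.04, 0.08, 0.12, 0.18, 0.26)


def gaussian_noise_sketch(img, severity=1, rng=None):
    """Add zero-mean Gaussian noise to a uint8 image; higher severity means more noise."""
    if not 1 <= severity <= 5:
        raise ValueError("severity must be an integer from 1 to 5")
    rng = np.random.default_rng() if rng is None else rng
    sigma = SIGMA_BY_SEVERITY[severity - 1]
    x = img.astype(np.float32) / 255.0
    x = x + rng.normal(scale=sigma, size=x.shape)
    return (np.clip(x, 0.0, 1.0) * 255).astype(np.uint8)


gray = np.full((32, 32, 3), 128, dtype=np.uint8)
noisy_1 = gaussian_noise_sketch(gray, severity=1, rng=np.random.default_rng(0))
noisy_5 = gaussian_noise_sketch(gray, severity=5, rng=np.random.default_rng(0))
print(np.abs(noisy_1.astype(int) - 128).mean(), np.abs(noisy_5.astype(int) - 128).mean())
```

Corruptions of this kind change only pixel values, so the bounding boxes stay as they are.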

Result of applying the augmentation:
The example below uses the "gaussian_noise" corruption:

Before  After
[Image 10]  [Image 11]

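6. CutOut

Description: Cutout has the same motivation as Random Erasing: it simulates occlusion in order to improve generalization. It is simpler to implement than Random Erasing: pick a fixed-size square region at a random position and fill it with zeros. To keep the zero fill from affecting training, the data should be mean-centered so that zero corresponds to the normalized mean. Note that the Cutout authors found the size of the region to matter more than its shape, so a plain square is enough. In practice, a fixed-size rectangle masks part of the image, and every value inside the rectangle is set to 0 or to another solid color.

Function description: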

class CutOut:
    """CutOut operation.
    Randomly drop some regions of image used in
    `Cutout`_.
    Args:
        n_holes (int | tuple[int, int]): Number of regions to be dropped.
            If it is given as a list, number of holes will be randomly
            selected from the closed interval [`n_holes[0]`, `n_holes[1]`].
        cutout_shape (tuple[int, int] | list[tuple[int, int]]): The candidate
            shape of dropped regions. It can be `tuple[int, int]` to use a
            fixed cutout shape, or `list[tuple[int, int]]` to randomly choose
            shape from the list.
        cutout_ratio (tuple[float, float] | list[tuple[float, float]]): The
            candidate ratio of dropped regions. It can be `tuple[float, float]`
            to use a fixed ratio or `list[tuple[float, float]]` to randomly
            choose ratio from the list. Please note that `cutout_shape`
            and `cutout_ratio` cannot be both given at the same time.
        fill_in (tuple[float, float, float] | tuple[int, int, int]): The value
            of pixel to fill in the dropped regions. Default: (0, 0, 0).
    """

    def __init__(self,
                 n_holes,
                 cutout_shape=None,
                 cutout_ratio=None,
                 fill_in=(0, 0, 0)):
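
A minimal NumPy sketch of the operation, assuming a single fixed `cutout_shape` given as (width, height) and an integer `n_holes`; the helper name `cutout_sketch` is hypothetical, and the ratio-based and list-based options from the docstring are not covered.

```python
import numpy as np


def cutout_sketch(img, n_holes, cutout_shape, fill_in=(0, 0, 0), rng=None):
    """Fill n_holes randomly placed rectangles of an (H, W, 3) image with fill_in."""
    rng = np.random.default_rng() if rng is None else rng
    out = img.copy()  # leave the input untouched
    h, w = out.shape[:2]
    cut_w, cut_h = cutout_shape
    for _ in range(n_holes):
        # Pick the top-left corner anywhere in the image.
        x1 = int(rng.integers(0, w))
        y1 = int(rng.integers(0, h))
        # Holes that cross the border are truncated at the image edge.
        x2 = min(x1 + cut_w, w)
        y2 = min(y1 + cut_h, h)
        out[y1:y2, x1:x2] = fill_in
    return out


white = np.full((16, 16, 3), 255, dtype=np.uint8)
holed = cutout_sketch(white, n_holes=2, cutout_shape=(4, 4), rng=np.random.default_rng(0))
print(int((holed == 0).all(axis=-1).sum()), "pixels were dropped")
```

Because the corner is sampled over the whole image, holes near the right or bottom edge come out smaller than the requested shape.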

Result of applying the augmentation:

Before  After
[Image 12]  [Image 13]

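7. MinIoURandomCrop

Description: randomly crops the image and its bboxes. The cropped patch must meet a minimum IoU requirement with the original image and bboxes, and that IoU threshold is picked at random from min_ious.

Function description: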

class MinIoURandomCrop:
    """Random crop the image & bboxes, the cropped patches have minimum IoU
    requirement with original image & bboxes, the IoU threshold is randomly
    selected from min_ious.

    Args:
        min_ious (tuple): minimum IoU threshold for all intersections with
        bounding boxes
        min_crop_size (float): minimum crop's size (i.e. h,w := a*h, a*w,
        where a >= min_crop_size).
        bbox_clip_border (bool, optional): Whether clip the objects outside
            the border of the image. Defaults to True.

    Note:
        The keys for bboxes, labels and masks should be paired. That is, \
        `gt_bboxes` corresponds to `gt_labels` and `gt_masks`, and \
        `gt_bboxes_ignore` to `gt_labels_ignore` and `gt_masks_ignore`.
    """

    def __init__(self,
                 min_ious=(0.1, 0.3, 0.5, 0.7, 0.9),
                 min_crop_size=0.3,
                 bbox_clip_border=True):
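
The acceptance test for a candidate patch can be sketched as below. This is a simplified sketch under assumptions: the helper names `patch_iou` and `crop_boxes` are hypothetical, boxes are in (x1, y1, x2, y2) format, and keeping only boxes whose center lies inside the patch is assumed here for illustration. Sampling the patch size and position, and the retry loop around it, are omitted.

```python
import numpy as np


def patch_iou(patch, boxes):
    """IoU between one patch (x1, y1, x2, y2) and each row of an (N, 4) box array."""
    ix1 = np.maximum(patch[0], boxes[:, 0])
    iy1 = np.maximum(patch[1], boxes[:, 1])
    ix2 = np.minimum(patch[2], boxes[:, 2])
    iy2 = np.minimum(patch[3], boxes[:, 3])
    inter = np.clip(ix2 - ix1, 0, None) * np.clip(iy2 - iy1, 0, None)
    area_patch = (patch[2] - patch[0]) * (patch[3] - patch[1])
    area_boxes = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_patch + area_boxes - inter)


def crop_boxes(patch, boxes, min_iou):
    """Return boxes in patch coordinates, or None if the patch is rejected."""
    if patch_iou(patch, boxes).min() < min_iou:
        return None  # some box overlaps the patch too little
    centers = (boxes[:, :2] + boxes[:, 2:]) / 2
    inside = ((centers[:, 0] > patch[0]) & (centers[:, 0] < patch[2]) &
              (centers[:, 1] > patch[1]) & (centers[:, 1] < patch[3]))
    if not inside.any():
        return None  # no box center falls inside the patch
    kept = boxes[inside].copy()
    # Clip to the patch border, then shift so the patch corner becomes the origin.
    kept[:, 0::2] = np.clip(kept[:, 0::2], patch[0], patch[2]) - patch[0]
    kept[:, 1::2] = np.clip(kept[:, 1::2], patch[1], patch[3]) - patch[1]
    return kept


boxes = np.array([[10.0, 10.0, 50.0, 50.0]])
patch = np.array([20.0, 20.0, 80.0, 80.0])
print(crop_boxes(patch, boxes, min_iou=0.3))  # rejected: IoU is about 0.21
print(crop_boxes(patch, boxes, min_iou=0.1))  # accepted and re-expressed in patch coordinates
```

A higher threshold drawn from min_ious therefore forces crops that stay close to the objects, while a lower one allows crops that cut objects more aggressively.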

Result of applying the augmentation:

Before  After
[Image 14]  [Image 15]

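8. Expand

Description: enlarges the canvas by a given ratio in both width and height, places the original image on it, and fills the remaining area with the mean value. This turns large objects into small ones and so increases the number of small-object samples.

Function description: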

class Expand:
    """Random expand the image & bboxes.

    Randomly place the original image on a canvas of 'ratio' x original image
    size filled with mean values. The ratio is in the range of ratio_range.

    Args:
        mean (tuple): mean value of dataset.
        to_rgb (bool): if need to convert the order of mean to align with RGB.
        ratio_range (tuple): range of expand ratio.
        prob (float): probability of applying this transformation
    """

    def __init__(self,
                 mean=(0, 0, 0),  # value used to fill the area added around the image
                 to_rgb=True,
                 ratio_range=(1, 4),  # how many times larger than the original the canvas can be
                 seg_ignore_label=None,
                 prob=0.5):
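
The canvas construction can be sketched in a few lines of NumPy. In this sketch the ratio and the paste position (`left`, `top`) are passed in explicitly to keep the example deterministic, whereas the transform would sample them at random; the helper name `expand_sketch` is hypothetical and boxes are assumed to be in (x1, y1, x2, y2) format.

```python
import numpy as np


def expand_sketch(img, boxes, ratio, left, top, mean=(0, 0, 0)):
    """Paste img onto a mean-filled canvas `ratio` times larger and move the boxes."""
    h, w, c = img.shape
    canvas = np.full((int(h * ratio), int(w * ratio), c), mean, dtype=img.dtype)
    canvas[top:top + h, left:left + w] = img
    # Boxes keep their size; only their position on the canvas changes.
    moved = boxes + np.array([left, top, left, top], dtype=boxes.dtype)
    return canvas, moved


img = np.full((4, 4, 3), 255, dtype=np.uint8)
boxes = np.array([[1, 1, 3, 3]])
canvas, moved = expand_sketch(img, boxes, ratio=2, left=3, top=2)
print(canvas.shape, moved.tolist())
```

The objects keep their pixel size while the image grows, so after the later resize step in a pipeline they occupy a smaller share of the input, which is how this transform produces more small-object samples.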

Result of applying the augmentation:

Before  After
[Image 16]  [Image 17]
