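When training neural networks, we often apply various augmentations to the original images to enlarge the dataset. The most common are rotation and flipping, and each can be implemented in several ways. This post explores what the different operations actually produce.
All experiments in this post use 2-D images, i.e. 2-D arrays.
The original image (source: https://baike.baidu.com/item/%E5%94%90%E8%80%81%E9%B8%AD/4344419?fr=aladdin):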
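np.flip works on arrays of any dimensionality, whereas flipud and fliplr are typically used on images (2-D arrays): the former flips the image up/down and the latter flips it left/right.
Reference: https://www.cnblogs.com/xiaoniu-666/p/11123560.html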
import numpy as np
img = np.flip(img) # flip every axis
img = np.flip(img, n) # flip axis n only
img = np.flip(img, (m,n,...)) # flip several specified axes at once
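To make the three call forms concrete, here is a minimal sketch on a tiny 2x3 array. The array `a` and the variable names are only for illustration and are not part of the experiments in this post.

```python
import numpy as np

# A tiny 2x3 array stands in for an image (values are illustrative only).
a = np.array([[1, 2, 3],
              [4, 5, 6]])

flip_all = np.flip(a)            # reverse every axis
flip_rows = np.flip(a, 0)        # reverse axis 0 only (row order)
flip_cols = np.flip(a, 1)        # reverse axis 1 only (column order)
flip_both = np.flip(a, (0, 1))   # reverse the listed axes together

print(flip_all)    # [[6 5 4]
                   #  [3 2 1]]
print(flip_rows)   # [[4 5 6]
                   #  [1 2 3]]
print(flip_cols)   # [[3 2 1]
                   #  [6 5 4]]
```

For a 2-D array, flipping with no axis argument and flipping with the tuple (0, 1) give the same result.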
ud = up/down
An up/down flip mirrors the image about the x-axis (the horizontal axis); on the array, it reverses axis 0.
import numpy as np
img = np.flipud(img) # flip up/down
# equivalent to: img = np.flip(img, 0)
# equivalent to: img = img[::-1, :]
Experiment:
import numpy as np
from PIL import Image
img = np.asarray(Image.open('./tang.png').convert('L')) # read the image and convert it to grayscale
img_pad = np.zeros((500, 500), dtype=np.uint8)
img_pad[:496, :] = img # the original image is 496x500 (rows x columns); pad it to a 500x500 square
img1 = np.flip(img_pad, 0)
img2 = np.flipud(img_pad)
img3 = img_pad[::-1, :]
img_cat = np.concatenate([img_pad, img1, img2, img3], axis=1)
Image.fromarray(img_cat).save('./tang_flipud.png')
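The saved picture lets you compare the three results by eye. The same comparison can also be done numerically. The sketch below uses a small synthetic array (`demo`, a stand-in I made up so the snippet runs without tang.png) and checks that the three spellings agree element by element.

```python
import numpy as np

# Synthetic 4x6 "image" so the check runs without any file on disk.
demo = np.arange(24, dtype=np.uint8).reshape(4, 6)

ud_flip = np.flip(demo, 0)
ud_func = np.flipud(demo)
ud_slice = demo[::-1, :]

# All three spellings should yield identical arrays.
same_ud = np.array_equal(ud_flip, ud_func) and np.array_equal(ud_func, ud_slice)
print(same_ud)                                      # True

# Flipping twice restores the original.
print(np.array_equal(np.flipud(ud_func), demo))     # True
```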
lr = left/right
A left/right flip mirrors the image about the y-axis (the vertical axis); on the array, it reverses axis 1.
import numpy as np
img = np.fliplr(img) # flip left/right
# equivalent to: img = np.flip(img, 1)
# equivalent to: img = img[:, ::-1]
Experiment:
img1 = np.flip(img_pad, 1)
img2 = np.fliplr(img_pad)
img3 = img_pad[:, ::-1]
img_cat = np.concatenate([img_pad, img1, img2, img3], axis=1)
Image.fromarray(img_cat).save('./tang_fliplr.png')
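One detail that is easy to miss when using these flips for augmentation: the flip functions and the `[::-1]` slices return views onto the original data rather than copies. The sketch below illustrates this on a synthetic array (`demo` is a made-up stand-in). Whether you need the extra `.copy()` depends on what consumes the array afterwards, so treat this as something to check rather than a rule.

```python
import numpy as np

demo = np.arange(12, dtype=np.uint8).reshape(3, 4)

flipped = np.fliplr(demo)                # a view: no pixel data is copied
flipped_copy = np.fliplr(demo).copy()    # an independent, C-contiguous array

print(np.array_equal(flipped, demo[:, ::-1]))     # True
print(np.shares_memory(flipped, demo))            # True
print(np.shares_memory(flipped_copy, demo))       # False
print(flipped.flags['C_CONTIGUOUS'])              # False
print(flipped_copy.flags['C_CONTIGUOUS'])         # True
```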
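In mathematics this operation is the transpose (np.transpose); on an array it swaps the two axes, and on an image it amounts to a flip about the main diagonal.
This transformation cannot be obtained from one up/down flip plus one left/right flip! (That combination gives a 180-degree rotation instead.)
Experiment: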
img1 = np.transpose(img_pad, (1, 0))
img_cat = np.concatenate([img_pad, img1], axis=1)
Image.fromarray(img_cat).save('./tang_transpose.png')
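The claim that flips alone never produce the transpose can be checked on a 2x2 array. This is only a small sanity check with an illustrative array `m`, not a proof: it enumerates every combination of the two flips and compares each against the transpose.

```python
import numpy as np

m = np.array([[1, 2],
              [3, 4]])

transposed = np.transpose(m, (1, 0))      # [[1, 3], [2, 4]]
double_flip = np.fliplr(np.flipud(m))     # [[4, 3], [2, 1]]

# One up/down flip plus one left/right flip is a 180-degree rotation.
print(np.array_equal(double_flip, np.rot90(m, 2)))   # True

# None of the four flip combinations matches the transpose.
flip_combos = [m, np.flipud(m), np.fliplr(m), double_flip]
print(any(np.array_equal(c, transposed) for c in flip_combos))   # False
```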
import numpy as np
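img = np.rot90(img, n) # n = 0, 1, 2, 3, ... rotates by 0, 90, 180, 270, ... degrees
# if n >= 4, the rotation angle is determined by n modulo 4
# positive n rotates counterclockwise, negative n rotates clockwise
Experiment: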
img1 = np.rot90(img_pad, 1)
img2 = np.rot90(img_pad, 2)
img3 = np.rot90(img_pad, 3)
img_cat = np.concatenate([img_pad, img1, img2, img3], axis=1)
Image.fromarray(img_cat).save('./tang_rot90.png')
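The rules for n described above can be seen on a small array. The sketch below uses illustrative arrays `m` and `rect`. It also shows why the experiments pad the image to a square first: an odd number of quarter turns swaps the two dimensions, so the rotated results of a non-square image could not be concatenated side by side with the original.

```python
import numpy as np

m = np.array([[1, 2],
              [3, 4]])

ccw = np.rot90(m, 1)     # counterclockwise: [[2, 4], [1, 3]]
cw = np.rot90(m, -1)     # clockwise:        [[3, 1], [4, 2]]
print(ccw)
print(cw)

# n is effectively taken modulo 4.
print(np.array_equal(np.rot90(m, 5), ccw))    # True
print(np.array_equal(np.rot90(m, 3), cw))     # True

# An odd number of quarter turns swaps height and width.
rect = np.zeros((2, 3), dtype=np.uint8)
print(np.rot90(rect, 1).shape)    # (3, 2)
print(np.rot90(rect, 2).shape)    # (2, 3)
```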
Combining flips and rotations, a 2-D image has 8 distinct variants. How can all 8 be generated?
I usually use one of the following two methods:
img1 = img_pad
img2 = img_pad[::-1, :] # flipud(img_pad)
img3 = img_pad[:, ::-1] # fliplr(img_pad)
img4 = img_pad[::-1, ::-1] # fliplr(flipud(img_pad))
img5 = np.transpose(img_pad, (1, 0))
img6 = img5[::-1, :]
img7 = img5[:, ::-1]
img8 = img5[::-1, ::-1]
img_cat1 = np.concatenate([img1,img2,img3,img4], axis=1)
img_cat2 = np.concatenate([img5,img6,img7,img8], axis=1)
img_cat = np.concatenate([img_cat1, img_cat2], axis=0)
Image.fromarray(img_cat).save('./tang_aug1.png')
img1 = img_pad
img2 = np.rot90(img_pad, 1)
img3 = np.rot90(img_pad, 2)
img4 = np.rot90(img_pad, 3)
img5 = np.transpose(img_pad, (1, 0))
img6 = np.rot90(img5, 1)
img7 = np.rot90(img5, 2)
img8 = np.rot90(img5, 3)
img_cat1 = np.concatenate([img1,img2,img3,img4], axis=1)
img_cat2 = np.concatenate([img5,img6,img7,img8], axis=1)
img_cat = np.concatenate([img_cat1, img_cat2], axis=0)
Image.fromarray(img_cat).save('./tang_aug2.png')
The two methods produce exactly the same eight results (only the order differs).
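This can be confirmed without looking at the pictures. The sketch below rebuilds both lists on a synthetic 3x3 array (`demo`, chosen so that all entries differ) and compares them as sets. Using `tobytes()` as a hashable fingerprint is my own shortcut here, and it only works because all eight variants of a square array have the same shape.

```python
import numpy as np

demo = np.arange(9, dtype=np.uint8).reshape(3, 3)
t = np.transpose(demo, (1, 0))

# Method 1: flips of the image and of its transpose.
by_flips = [demo, demo[::-1, :], demo[:, ::-1], demo[::-1, ::-1],
            t, t[::-1, :], t[:, ::-1], t[::-1, ::-1]]

# Method 2: rotations of the image and of its transpose.
by_rot = ([np.rot90(demo, k) for k in range(4)]
          + [np.rot90(t, k) for k in range(4)])

# tobytes() serializes elements in C order, giving a hashable fingerprint.
set_flips = {v.tobytes() for v in by_flips}
set_rot = {v.tobytes() for v in by_rot}

print(len(set_flips), len(set_rot), set_flips == set_rot)   # 8 8 True
```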
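The comparison also shows that:
1) up/down flip = diagonal flip (transpose) followed by a 90-degree counterclockwise rotation
2) left/right flip = diagonal flip (transpose) followed by a 90-degree clockwise rotation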
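Both identities can be checked directly. The sketch below uses a non-square synthetic array (`demo`, made up for this check) to show that they do not depend on the image being square. The order matters: the transpose is applied first, then the rotation.

```python
import numpy as np

demo = np.arange(12).reshape(3, 4)    # non-square on purpose
t = np.transpose(demo, (1, 0))        # diagonal flip

# 1) up/down flip == transpose, then rotate 90 degrees counterclockwise
ud_identity = np.array_equal(np.flipud(demo), np.rot90(t, 1))

# 2) left/right flip == transpose, then rotate 90 degrees clockwise
lr_identity = np.array_equal(np.fliplr(demo), np.rot90(t, -1))

print(ud_identity, lr_identity)   # True True
```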