Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/weixin_44474718/article/details/86514697
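In Keras, image augmentation is implemented through the ImageDataGenerator class.
The features covered in this post include feature standardization, ZCA whitening, random rotation, shifting, shearing and flipping of images, and saving augmented images to disk.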
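After creating and configuring an ImageDataGenerator, call its fit() function to fit it to the dataset. This step computes all the statistics needed to actually perform the image transformations.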
Original images:
from keras.datasets import mnist
from matplotlib import pyplot as plt
# Load the MNIST dataset from Keras
(X_train, y_train), (X_validation, y_validation) = mnist.load_data()
# Display 9 handwritten digit images in a 3x3 grid
for i in range(0, 9):
    plt.subplot(331 + i)
    plt.imshow(X_train[i], cmap=plt.get_cmap('gray'))
plt.show()
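Standardizing the pixel values across the entire image dataset is called feature standardization. It can improve the performance of neural network algorithms.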
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
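Explanation: the training set is loaded as a three-dimensional array (samples, height, width). A multilayer perceptron would need each image flattened into a pixel vector, turning a 28×28 image into 784 input values. Here, however, reshape() on the NumPy array keeps the 28×28 layout and only adds a channel axis, giving the shape (samples, 1, 28, 28) that ImageDataGenerator expects for channels_first data.
The pixel values are also cast to 32-bit floats ('float32'), which is Keras's default floating-point precision and keeps memory use lower than 64-bit floats would.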
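To make the two reshapes concrete, here is a minimal NumPy sketch. The array `images` is a hypothetical stand-in for X_train (all zeros, only the shapes matter), so it runs without downloading MNIST.

```python
import numpy as np

# Hypothetical stand-in for X_train: 5 grayscale images, 28x28, uint8 like MNIST
images = np.zeros((5, 28, 28), dtype=np.uint8)

# Flattening for a multilayer perceptron: one 784-value vector per image
flat = images.reshape(images.shape[0], 28 * 28).astype('float32')

# Adding a channel axis (channels_first) as done in this post
with_channel = images.reshape(images.shape[0], 1, 28, 28).astype('float32')

print(flat.shape)          # (5, 784)
print(with_channel.shape)  # (5, 1, 28, 28)
print(with_channel.dtype)  # float32
```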
from keras.datasets import mnist
from keras.preprocessing.image import ImageDataGenerator
from matplotlib import pyplot as plt
from keras import backend
backend.set_image_data_format('channels_first')  # set the image data format convention
# Load the data
(X_train, y_train), (X_validation, y_validation) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_validation = X_validation.reshape(X_validation.shape[0], 1, 28, 28).astype('float32')
# Feature standardization
imgGen = ImageDataGenerator(featurewise_center=True, featurewise_std_normalization=True)
imgGen.fit(X_train)
# Take one batch of 9 images, plot it, then stop (flow() loops forever)
for X_batch, y_batch in imgGen.flow(X_train, y_train, batch_size=9):
    for i in range(0, 9):
        plt.subplot(331 + i)
        plt.imshow(X_batch[i].reshape(28, 28), cmap=plt.get_cmap('gray'))
    plt.show()
    break
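For intuition, the statistics that feature standardization relies on can be sketched in plain NumPy. This is an illustration under assumptions, not Keras's own code: the random array `X` is a hypothetical stand-in for X_train, and the small constant `eps` is added here only to avoid division by zero.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for X_train: 100 channels_first images with values in [0, 255]
X = rng.uniform(0, 255, size=(100, 1, 28, 28)).astype('float32')

# Dataset-wide statistics, one value per channel
mean = X.mean(axis=(0, 2, 3), keepdims=True)
std = X.std(axis=(0, 2, 3), keepdims=True)

# Center and scale every image with the same dataset-wide statistics
eps = 1e-6  # assumption: small constant to avoid division by zero
X_standardized = (X - mean) / (std + eps)

print(X_standardized.shape)
```

After this step the dataset as a whole has a mean close to 0 and a standard deviation close to 1, while each image keeps its shape.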
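Whitening an image is a linear algebra operation that reduces redundancy and correlation in the image's pixel matrix, so that structures and features stand out more clearly for the learning algorithm. Two popular approaches are principal component analysis (PCA) and ZCA whitening. ZCA whitening keeps the transformed image at its original dimensions and has been reported to be more widely applicable.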
from keras.datasets import mnist
from keras.preprocessing.image import ImageDataGenerator
from matplotlib import pyplot as plt
from keras import backend
backend.set_image_data_format('channels_first')
# Load the data
(X_train, y_train), (X_validation, y_validation) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_validation = X_validation.reshape(X_validation.shape[0], 1, 28, 28).astype('float32')
# ZCA whitening
imgGen = ImageDataGenerator(zca_whitening=True)
imgGen.fit(X_train)
# flow() takes the data and label arrays and yields batches of augmented data
for X_batch, y_batch in imgGen.flow(X_train, y_train, batch_size=9):
    for i in range(0, 9):
        plt.subplot(331 + i)
        plt.imshow(X_batch[i].reshape(28, 28), cmap=plt.get_cmap('gray'))
    plt.show()
    break
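The idea behind ZCA whitening can be sketched with NumPy on flattened samples. This is a sketch of the general technique, not a copy of what Keras does internally: the helper name `zca_whiten`, the `eps` value and the small synthetic dataset are all assumptions made for illustration.

```python
import numpy as np

def zca_whiten(X, eps=1e-6):
    """Whiten rows of X (n_samples, n_features); output keeps the same shape."""
    X = X - X.mean(axis=0)                    # center each feature
    cov = X.T @ X / X.shape[0]                # feature covariance matrix
    U, S, _ = np.linalg.svd(cov)              # eigen-decomposition of the covariance
    W = U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T  # ZCA whitening matrix
    return X @ W, W

rng = np.random.default_rng(0)
# Synthetic correlated data: 2000 samples with 4 features each
mix = np.eye(4) + 0.5 * np.ones((4, 4))
X = rng.normal(size=(2000, 4)) @ mix

X_white, W = zca_whiten(X)
cov_white = X_white.T @ X_white / X_white.shape[0]
print(np.round(cov_white, 3))  # approximately the identity matrix
```

Because the rotation by U is undone by the final multiplication with its transpose, the whitened data stays in the original feature space, which is why ZCA-whitened images keep their original dimensions.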
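When training samples are scarce, the random transformations below (rotation, shifting, shearing and flipping) act as a form of sample augmentation.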
from keras.datasets import mnist
from keras.preprocessing.image import ImageDataGenerator
from matplotlib import pyplot as plt
from keras import backend
backend.set_image_data_format('channels_first')
# Load the data
(X_train, y_train), (X_validation, y_validation) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_validation = X_validation.reshape(X_validation.shape[0], 1, 28, 28).astype('float32')
# Random rotation by up to 90 degrees
imgGen = ImageDataGenerator(rotation_range=90)
# fit() is only required for featurewise statistics or ZCA whitening; it is kept here for consistency
imgGen.fit(X_train)
for X_batch, y_batch in imgGen.flow(X_train, y_train, batch_size=9):
    for i in range(0, 9):
        plt.subplot(331 + i)
        plt.imshow(X_batch[i].reshape(28, 28), cmap=plt.get_cmap('gray'))
    plt.show()
    break
# Random shifting by up to 20% of the width and height
imgGen = ImageDataGenerator(width_shift_range=0.2, height_shift_range=0.2)
imgGen.fit(X_train)
for X_batch, y_batch in imgGen.flow(X_train, y_train, batch_size=9):
    for i in range(0, 9):
        plt.subplot(331 + i)
        plt.imshow(X_batch[i].reshape(28, 28), cmap=plt.get_cmap('gray'))
    plt.show()
    break
# Random shearing
imgGen = ImageDataGenerator(shear_range=0.2)
imgGen.fit(X_train)
for X_batch, y_batch in imgGen.flow(X_train, y_train, batch_size=9):
    for i in range(0, 9):
        plt.subplot(331 + i)
        plt.imshow(X_batch[i].reshape(28, 28), cmap=plt.get_cmap('gray'))
    plt.show()
    break
# Random horizontal and vertical flipping
imgGen = ImageDataGenerator(horizontal_flip=True, vertical_flip=True)
imgGen.fit(X_train)
for X_batch, y_batch in imgGen.flow(X_train, y_train, batch_size=9):
    for i in range(0, 9):
        plt.subplot(331 + i)
        plt.imshow(X_batch[i].reshape(28, 28), cmap=plt.get_cmap('gray'))
    plt.show()
    break
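Of the four transformations, flipping is the simplest to reproduce by hand. The sketch below shows random flips on a channels_first batch using only NumPy. The helper `random_flip` and the 0.5 flip probability are assumptions for illustration, not part of the Keras API.

```python
import numpy as np

def random_flip(batch, rng, horizontal=True, vertical=True):
    """Randomly flip each image in a (N, C, H, W) batch along width and/or height."""
    out = batch.copy()
    for k in range(out.shape[0]):
        if horizontal and rng.random() < 0.5:
            out[k] = out[k][:, :, ::-1].copy()  # reverse the width axis
        if vertical and rng.random() < 0.5:
            out[k] = out[k][:, ::-1, :].copy()  # reverse the height axis
    return out

rng = np.random.default_rng(0)
# Tiny hypothetical batch: 2 images, 1 channel, 2x3 pixels
batch = np.arange(12, dtype='float32').reshape(2, 1, 2, 3)
flipped = random_flip(batch, rng)
print(flipped.shape)  # (2, 1, 2, 3)
```

A flip only rearranges pixels, so each output image contains exactly the same pixel values as its input. Note that for handwritten digits a flip can change what the image means, so this augmentation is shown here mainly as a demonstration.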
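Keras can generate augmented images on the fly while training a model. However, when the same augmented images are used to train several models, regenerating them each time is very time-consuming. The generated images can therefore be saved to files during training.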
from keras.datasets import mnist
from keras.preprocessing.image import ImageDataGenerator
from matplotlib import pyplot as plt
from keras import backend
import os
backend.set_image_data_format('channels_first')
# Load the data
(X_train, y_train), (X_validation, y_validation) = mnist.load_data()
X_train = X_train.reshape(X_train.shape[0], 1, 28, 28).astype('float32')
X_validation = X_validation.reshape(X_validation.shape[0], 1, 28, 28).astype('float32')
# ZCA whitening
imgGen = ImageDataGenerator(zca_whitening=True)
imgGen.fit(X_train)
# Create the output directory if it does not exist yet
os.makedirs('image', exist_ok=True)
# Each generated batch is also written to the 'image' directory as PNG files
for X_batch, y_batch in imgGen.flow(X_train, y_train, batch_size=9, save_to_dir='image',
                                    save_prefix='oct', save_format='png'):
    for i in range(0, 9):
        plt.subplot(331 + i)
        plt.imshow(X_batch[i].reshape(28, 28), cmap=plt.get_cmap('gray'))
    plt.show()
    break
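To reuse the saved images when training another model, the files first have to be collected again. The sketch below assumes that the saved file names start with the save_prefix followed by an underscore and end with the chosen format. The helper `list_saved_images` is hypothetical, and the demo uses empty placeholder files in a temporary directory instead of real images.

```python
import glob
import os
import tempfile

def list_saved_images(directory, prefix='oct', ext='png'):
    """Return sorted paths of files named like '<prefix>_*.<ext>' in directory."""
    pattern = os.path.join(directory, '{}_*.{}'.format(prefix, ext))
    return sorted(glob.glob(pattern))

# Demo with placeholder files (assumed naming; no real images are written)
demo_dir = tempfile.mkdtemp()
for name in ('oct_0_123.png', 'oct_1_456.png', 'notes.txt'):
    open(os.path.join(demo_dir, name), 'w').close()

saved = list_saved_images(demo_dir)
names = [os.path.basename(p) for p in saved]
print(names)  # ['oct_0_123.png', 'oct_1_456.png']
```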