- This post is a learning-record blog entry from the 365天深度学习训练营 (365-day deep learning training camp)
- Reference: 365天深度学习训练营, Week 10: Data Augmentation (available to camp members only)
- Original author: K同学啊 | offers tutoring and custom project services
In this tutorial you will learn how to perform data augmentation and use it to reach very good recognition accuracy with only a small amount of data.
I will demonstrate two ways to apply data augmentation, as well as how to define a custom augmentation and plug it into our code. The two approaches are:
● Embedding the augmentation layers inside the model
● Applying augmentation inside the Dataset pipeline
import matplotlib.pyplot as plt
import numpy as np
# Hide warnings
import warnings
warnings.filterwarnings('ignore')
from tensorflow.keras import layers
import tensorflow as tf

gpus = tf.config.list_physical_devices("GPU")
if gpus:
    tf.config.experimental.set_memory_growth(gpus[0], True)  # allocate GPU memory on demand
    tf.config.set_visible_devices([gpus[0]], "GPU")
# Print the GPU list to confirm a GPU is available
print(gpus)
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
data_dir = "/home/mw/input/dogcat3675/365-7-data/"
img_height = 224
img_width = 224
batch_size = 32
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.3,
    subset="training",
    seed=12,
    image_size=(img_height, img_width),
    batch_size=batch_size)
Found 3400 files belonging to 2 classes.
Using 2380 files for training.
2022-09-30 07:53:50.326136: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-09-30 07:53:51.442078: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13797 MB memory: -> device: 0, name: Tesla T4, pci bus id: 0000:00:1e.0, compute capability: 7.5
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.3,
    subset="validation",
    seed=12,
    image_size=(img_height, img_width),
    batch_size=batch_size)
Found 3400 files belonging to 2 classes.
Using 1020 files for validation.
Since the original dataset does not include a test set, we need to create one. Use tf.data.experimental.cardinality to determine how many batches the validation set contains, then move 20% of them into a test set.
val_batches = tf.data.experimental.cardinality(val_ds)
test_ds = val_ds.take(val_batches // 5)
val_ds = val_ds.skip(val_batches // 5)
print('Number of validation batches: %d' % tf.data.experimental.cardinality(val_ds))
print('Number of test batches: %d' % tf.data.experimental.cardinality(test_ds))
Number of validation batches: 26
Number of test batches: 6
There are two classes in total: cat and dog.
class_names = train_ds.class_names
print(class_names)
['cat', 'dog']
AUTOTUNE = tf.data.experimental.AUTOTUNE

def preprocess_image(image, label):
    return (image / 255.0, label)

# Normalize the pixel values to [0, 1]
train_ds = train_ds.map(preprocess_image, num_parallel_calls=AUTOTUNE)
val_ds = val_ds.map(preprocess_image, num_parallel_calls=AUTOTUNE)
test_ds = test_ds.map(preprocess_image, num_parallel_calls=AUTOTUNE)
train_ds = train_ds.cache().prefetch(buffer_size=AUTOTUNE)
val_ds = val_ds.cache().prefetch(buffer_size=AUTOTUNE)
plt.figure(figsize=(15, 10))  # figure width 15, height 10
for images, labels in train_ds.take(1):
    for i in range(8):
        ax = plt.subplot(5, 8, i + 1)
        plt.imshow(images[i])
        plt.title(class_names[labels[i]])
        plt.axis("off")
2022-09-30 07:55:08.306682: W tensorflow/core/kernels/data/cache_dataset_ops.cc:768] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
We can use tf.keras.layers.experimental.preprocessing.RandomFlip and tf.keras.layers.experimental.preprocessing.RandomRotation for data augmentation.
● tf.keras.layers.experimental.preprocessing.RandomFlip: randomly flips each image horizontally and vertically.
● tf.keras.layers.experimental.preprocessing.RandomRotation: randomly rotates each image.
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.experimental.preprocessing.RandomFlip("horizontal_and_vertical"),
    tf.keras.layers.experimental.preprocessing.RandomRotation(0.2),
])
The first layer performs random horizontal and vertical flips; the second performs random rotation with factor 0.2, i.e. the rotation angle is drawn from [-0.2 * 2π, 0.2 * 2π] (roughly ±72°), not 0.2 radians.
# Add the image to a batch (i here is left over from the display loop above, so this picks the last image shown).
image = tf.expand_dims(images[i], 0)
TensorFlow supports two main families of data-augmentation methods:
1. The functions provided by the tf.image module.
2. The Keras preprocessing layers, e.g. tf.keras.layers.RandomFlip, tf.keras.layers.Resizing, etc.
To make it easier to see what each augmentation API does, we define a helper that shows the original image next to the augmented one.
def visualize(original, augmented):
    # tf.squeeze drops a leading batch dimension of size 1 so imshow receives (H, W, C)
    fig = plt.figure()
    plt.subplot(1, 2, 1)
    plt.title('Original image')
    plt.imshow(tf.squeeze(original))
    plt.subplot(1, 2, 2)
    plt.title('Augmented image')
    plt.imshow(tf.squeeze(augmented))
flipped = tf.image.flip_left_right(image)
visualize(image, flipped)
tf.image.adjust_saturation takes the image and a saturation factor (here, 3).
saturated = tf.image.adjust_saturation(image, 3)
visualize(image, saturated)
In TensorFlow 2.6, tf.image provides two kinds of random-augmentation APIs: the tf.image.random_* functions and the tf.image.stateless_random_* functions; the official documentation recommends the latter.
Function notes: tf.image.stateless_random_brightness takes three arguments:
● image: the image tensor
● max_delta: the brightness range; the brightness offset is drawn from [-max_delta, max_delta)
● seed: a random seed given as a 2-element tuple; the same seed (with the same other inputs) produces the same transformation.
for i in range(3):
    seed = (i, 0)  # tuple of size (2,)
    stateless_random_brightness = tf.image.stateless_random_brightness(
        image, max_delta=0.95, seed=seed)
    visualize(image, stateless_random_brightness)
Function notes: in tf.image.stateless_random_crop, size specifies the size of the cropped output, and seed is again a tuple of shape (2,).
for i in range(3):
    seed = (i, 0)  # tuple of size (2,)
    stateless_random_crop = tf.image.stateless_random_crop(
        image, size=[1, 210, 210, 3], seed=seed)  # size must match the rank of the batched image and fit inside the 224x224 input
    visualize(image, stateless_random_crop)
Function notes:
tf.image.resize_with_crop_or_pad: resizes an image to a target size; unlike tf.image.resize, it pads or crops relative to the original image instead of stretching it, so the aspect ratio of the content is preserved.
tf.random.experimental.stateless_split(seed, num): takes an RNG-style seed (a shape-(2,) tensor of int32 or int64, e.g. seed=[1, 2]) and the number of seeds to return, num, and returns a new seed tensor of shape [num, 2]. In the augment function below we slice its return value because it is a [1, 2] 2-D array, while the stateless augmentation ops expect a seed of shape (2,).
tf.image.stateless_random_crop and tf.image.stateless_random_brightness are the stateless random-augmentation functions shown above, so they are not described again here.
tf.clip_by_value(image, 0, 1): clamps the image values to [0, 1]; we call it to keep the augmented pixel values from leaving that range.
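To make the shape handling concrete, here is a minimal sketch (not from the original post) of the slicing step described above:
# Sketch: why augment() slices the result of stateless_split with [0, :]
seed = tf.constant([1, 2], dtype=tf.int64)                       # an RNG-style seed of shape (2,)
new_seeds = tf.random.experimental.stateless_split(seed, num=1)  # shape [1, 2]
new_seed = new_seeds[0, :]                                       # shape (2,), usable by the stateless_random_* ops
print(new_seeds.shape, new_seed.shape)                           # (1, 2) (2,)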
The map function lets you process the data flexibly according to your own needs; you only have to define a preprocessing function, so the augmentation steps can also live inside that function. An example preprocessing function:
IMG_SIZE = 224  # not defined in the original post; assumed here to match img_height/img_width above

def resize_and_rescale(image, label):
    image = tf.cast(image, tf.float32)
    image = tf.image.resize(image, [IMG_SIZE, IMG_SIZE])
    image = (image / 255.0)
    return image, label
First we define a basic preprocessing function that casts the image, resizes it, and normalizes the pixel values (if you use tf.image.convert_image_dtype(image, tf.float32) instead, the data is automatically cast to float32 and scaled into [0, 1)).
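As a sketch (not in the original post), the same preprocessing written with convert_image_dtype, which folds the cast and the scaling into a single call:
# Alternative sketch: convert_image_dtype casts to float32 and scales to [0, 1) in one step
def resize_and_rescale_v2(image, label):
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize(image, [IMG_SIZE, IMG_SIZE])
    return image, label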
def augment(image_label, seed):
    image, label = image_label
    image, label = resize_and_rescale(image, label)
    image = tf.image.resize_with_crop_or_pad(image, IMG_SIZE + 6, IMG_SIZE + 6)
    # Make a new seed.
    new_seed = tf.random.experimental.stateless_split(seed, num=1)[0, :]
    # Random crop back to the original size.
    image = tf.image.stateless_random_crop(
        image, size=[IMG_SIZE, IMG_SIZE, 3], seed=seed)
    # Random brightness.
    image = tf.image.stateless_random_brightness(
        image, max_delta=0.5, seed=new_seed)
    image = tf.clip_by_value(image, 0, 1)
    return image, label
Next we define an augmentation function that calls resize_and_rescale for the basic preprocessing; with a few more tweaks you can add whatever augmentation steps you like.
As mentioned above, the stateless random-augmentation functions take a seed argument, and the same seed (with the same other arguments) always produces the same result, so we need a seed generator that yields a different seed on every call.
TensorFlow provides tf.random.Generator for this: create an instance with tf.random.Generator.from_seed and call its make_seeds method whenever you need fresh seeds. The code below shows how.
# Create a generator.
rng = tf.random.Generator.from_seed(123, alg='philox')
Notes:
make_seeds(count=1): returns a tensor of shape [2, count] with dtype int64.
That covers the stateless augmentation machinery; all that remains is to apply a wrapper function f(x, y) with map to build the augmented dataset.
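The post does not show f itself; a minimal wrapper, assuming the rng generator and the augment function defined above, could look like this:
# Sketch of the per-element wrapper used in the map call below
def f(x, y):
    seed = rng.make_seeds(1)[:, 0]        # make_seeds(1) has shape [2, 1]; take column 0 -> shape (2,)
    image, label = augment((x, y), seed)  # apply the stateless augmentations defined above
    return image, label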
train_datasets = (
    train_datasets  # an unbatched tf.data.Dataset of (image, label) pairs
    .shuffle(1000)
    .map(f, num_parallel_calls=AUTOTUNE)
    .batch(batch_size)
    .prefetch(AUTOTUNE)
)
plt.figure(figsize=(8, 8))
for i in range(9):
    augmented_image = data_augmentation(image)
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(augmented_image[0])
    plt.axis("off")
model = tf.keras.Sequential([
    data_augmentation,
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
])
The advantage of this approach:
● The augmentation work benefits from GPU acceleration (if you train on a GPU).
Note: the augmentation is applied only during training (Model.fit); it is not applied during evaluation (Model.evaluate) or prediction (Model.predict).
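A quick sketch (not from the original post) of this behaviour when calling the layer stack directly:
# Preprocessing layers are active only when training=True
aug_train = data_augmentation(image, training=True)   # random flip/rotation applied
aug_infer = data_augmentation(image, training=False)  # the input passes through unchanged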
batch_size = 32
AUTOTUNE = tf.data.experimental.AUTOTUNE

def prepare(ds):
    ds = ds.map(lambda x, y: (data_augmentation(x, training=True), y), num_parallel_calls=AUTOTUNE)
    return ds

train_ds = prepare(train_ds)
model = tf.keras.Sequential([
    layers.Conv2D(16, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding='same', activation='relu'),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(len(class_names))
])
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])

epochs = 20
history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epochs
)
Epoch 1/20
2022-09-30 07:57:46.881833: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8200
2022-09-30 07:57:48.703042: I tensorflow/core/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
75/75 [==============================] - 15s 100ms/step - loss: 0.7076 - accuracy: 0.6189 - val_loss: 0.4560 - val_accuracy: 0.7754
Epoch 2/20
75/75 [==============================] - 6s 75ms/step - loss: 0.4080 - accuracy: 0.8189 - val_loss: 0.3394 - val_accuracy: 0.8430
Epoch 3/20
75/75 [==============================] - 6s 74ms/step - loss: 0.2799 - accuracy: 0.8756 - val_loss: 0.2552 - val_accuracy: 0.8901
Epoch 4/20
75/75 [==============================] - 6s 73ms/step - loss: 0.2386 - accuracy: 0.8975 - val_loss: 0.2243 - val_accuracy: 0.9022
Epoch 5/20
75/75 [==============================] - 6s 75ms/step - loss: 0.2039 - accuracy: 0.9130 - val_loss: 0.1770 - val_accuracy: 0.9251
Epoch 6/20
75/75 [==============================] - 6s 75ms/step - loss: 0.1961 - accuracy: 0.9244 - val_loss: 0.1876 - val_accuracy: 0.9179
Epoch 7/20
75/75 [==============================] - 6s 74ms/step - loss: 0.1720 - accuracy: 0.9286 - val_loss: 0.1435 - val_accuracy: 0.9372
Epoch 8/20
75/75 [==============================] - 6s 76ms/step - loss: 0.1583 - accuracy: 0.9345 - val_loss: 0.1412 - val_accuracy: 0.9396
Epoch 9/20
75/75 [==============================] - 6s 73ms/step - loss: 0.1343 - accuracy: 0.9471 - val_loss: 0.1378 - val_accuracy: 0.9420
Epoch 10/20
75/75 [==============================] - 6s 75ms/step - loss: 0.1349 - accuracy: 0.9513 - val_loss: 0.1194 - val_accuracy: 0.9505
Epoch 11/20
75/75 [==============================] - 6s 74ms/step - loss: 0.1456 - accuracy: 0.9458 - val_loss: 0.1313 - val_accuracy: 0.9469
Epoch 12/20
75/75 [==============================] - 6s 74ms/step - loss: 0.1209 - accuracy: 0.9521 - val_loss: 0.1316 - val_accuracy: 0.9432
Epoch 13/20
75/75 [==============================] - 6s 74ms/step - loss: 0.1300 - accuracy: 0.9487 - val_loss: 0.1215 - val_accuracy: 0.9529
Epoch 14/20
75/75 [==============================] - 6s 74ms/step - loss: 0.1016 - accuracy: 0.9651 - val_loss: 0.1420 - val_accuracy: 0.9517
Epoch 15/20
75/75 [==============================] - 6s 75ms/step - loss: 0.1063 - accuracy: 0.9592 - val_loss: 0.1463 - val_accuracy: 0.9493
Epoch 16/20
75/75 [==============================] - 6s 74ms/step - loss: 0.1083 - accuracy: 0.9538 - val_loss: 0.1392 - val_accuracy: 0.9481
Epoch 17/20
75/75 [==============================] - 6s 74ms/step - loss: 0.0990 - accuracy: 0.9655 - val_loss: 0.1281 - val_accuracy: 0.9481
Epoch 18/20
75/75 [==============================] - 6s 75ms/step - loss: 0.1149 - accuracy: 0.9563 - val_loss: 0.1191 - val_accuracy: 0.9626
Epoch 19/20
75/75 [==============================] - 6s 75ms/step - loss: 0.0910 - accuracy: 0.9634 - val_loss: 0.1791 - val_accuracy: 0.9493
Epoch 20/20
75/75 [==============================] - 6s 74ms/step - loss: 0.0933 - accuracy: 0.9672 - val_loss: 0.0946 - val_accuracy: 0.9674
loss, acc = model.evaluate(test_ds)
print("Accuracy", acc)
6/6 [==============================] - 0s 33ms/step - loss: 0.0861 - accuracy: 0.9688
Accuracy 0.96875
import random

# This is where you can experiment freely with your own augmentations
def aug_img(image):
    seed = (random.randint(0, 9), 0)
    # Randomly change the image brightness
    stateless_random_brightness = tf.image.stateless_random_brightness(
        image, max_delta=0.95, seed=seed)
    return stateless_random_brightness
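As a sketch of how you might extend this (the function name and parameter values here are my own, not from the original post), several stateless ops can be chained in one custom function:
# Sketch: chain several stateless augmentations in a single custom function
def aug_img_v2(image):
    seed = (random.randint(0, 9), 0)
    image = tf.image.stateless_random_flip_left_right(image, seed=seed)
    image = tf.image.stateless_random_contrast(image, lower=0.5, upper=1.5, seed=seed)
    image = tf.image.stateless_random_saturation(image, lower=0.5, upper=1.5, seed=seed)
    return image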
image = tf.expand_dims(images[3]*255, 0)
print("Min and max pixel values:", image.numpy().min(), image.numpy().max())
Min and max pixel values: 2.4591687 241.47968
plt.figure(figsize=(8, 8))
for i in range(9):
    augmented_image = aug_img(image)
    ax = plt.subplot(3, 3, i + 1)
    plt.imshow(augmented_image[0].numpy().astype("uint8"))
    plt.axis("off")