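Paper title: "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks"
Paper link: https://arxiv.org/pdf/1905.11946.pdf
EfficientNet is a network architecture proposed by Google in 2019. Compared with earlier networks, it sharply reduces the number of model parameters while also improving prediction accuracy. The figure below shows the results of EfficientNet B0-B7 on ImageNet, where it outperforms the other networks by a wide margin.
To find a model scaling method that balances speed and accuracy, the authors summarize the dimensions along which earlier work scaled networks, namely network depth, network width, and image resolution, and observe that most networks adjust only one of these three dimensions. The authors argue that the three dimensions influence each other, so scaling them together gives better results.
Figure (a) is a baseline network, and figures (b), (c), and (d) are the conventional scaling methods: some networks widen the baseline, some deepen it, and some increase the input resolution. Figure (e), on the far right, shows the main idea of EfficientNet: scale width, depth, and resolution at the same time.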
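The compound scaling idea can be made concrete with a few lines of arithmetic. The sketch below assumes the coefficients reported in the EfficientNet paper (alpha = 1.2 for depth, beta = 1.1 for width, gamma = 1.15 for resolution, with a single compound coefficient phi); these values are not given in this post, and the function names are illustrative only. The multipliers actually used in released B1-B7 models may be rounded differently.

```python
# Compound scaling coefficients as reported in the EfficientNet paper
# (assumption: they are not stated in this post).
ALPHA = 1.2   # base of the depth multiplier
BETA = 1.1    # base of the width multiplier
GAMMA = 1.15  # base of the resolution multiplier


def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi


def flops_multiplier(phi):
    """Approximate FLOPS growth: depth * width^2 * resolution^2."""
    depth, width, resolution = compound_scale(phi)
    return depth * width ** 2 * resolution ** 2


for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
          f"resolution x{r:.2f}, FLOPS x{flops_multiplier(phi):.2f}")
```

Because alpha * beta^2 * gamma^2 is close to 2, each increment of phi roughly doubles the compute budget while growing all three dimensions together.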
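Borrowing the idea of depthwise separable convolution, EfficientNet is built from MBConv (mobile inverted bottleneck convolution) blocks and also uses SE (squeeze-and-excitation) modules to optimize the network structure. In this experiment we use EfficientNet-B0 for image classification.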
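To see why depthwise separable convolution saves parameters, it helps to count weights. The sketch below compares a standard convolution with a depthwise convolution followed by a 1x1 pointwise convolution; the layer sizes (3x3 kernel, 32 input channels, 64 output channels) are hypothetical and biases are ignored.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights of a standard convolution: every output channel sees every input channel."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel, c_in, c_out):
    """Weights of a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * c_in   # one spatial filter per input channel
    pointwise = c_in * c_out             # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 32 input channels, 64 output channels
standard = standard_conv_params(3, 32, 64)
separable = separable_conv_params(3, 32, 64)
print(standard, separable, round(standard / separable, 2))
```

For this layer the separable version needs 2,336 weights instead of 18,432, roughly an eightfold reduction.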
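The structure of EfficientNet-B0 is shown in the table below:
Because the tf.keras.applications module in the TensorFlow version used here does not include EfficientNet, we need to install the efficientnet package first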
pip install -q efficientnet
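Link: https://pan.baidu.com/s/17HsGlsIB2xP8oeUywuwf3A
Extraction code: kl0h
We use the monkey dataset from Kaggle. It contains two folders: a training set and a validation set. Each folder contains 10 subfolders labeled n0-n9, one per monkey species. The images are 400x300 pixels or larger and in JPEG format (nearly 1,400 images in total).
Sample images:
# Import the required libraries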
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import ModelCheckpoint
from sklearn.metrics import confusion_matrix
import efficientnet.tfkeras as efn
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
import pandas as pd
import itertools
import os
# Image height and width, batch size, and number of training epochs
im_height = 224
im_width = 224
batch_size = 128
epochs = 15
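# Create the folder for saved models
if not os.path.exists("save_weights"):
    os.makedirs("save_weights")
image_path = "../input/10-monkey-species/"  # path to the monkey dataset
train_dir = image_path + "training/training"  # training set path
validation_dir = image_path + "validation/validation"  # validation set path
# Define the training image generator with data augmentation
train_image_generator = ImageDataGenerator(
    rescale=1./255,          # scale pixel values to [0, 1]
    rotation_range=40,       # random rotation range in degrees
    width_shift_range=0.2,   # horizontal shift range
    height_shift_range=0.2,  # vertical shift range
    shear_range=0.2,         # shear intensity
    zoom_range=0.2,          # random zoom range
    horizontal_flip=True,    # random horizontal flip
    fill_mode='nearest')     # fill new pixels with the nearest value
# Read samples from train_dir with the generator; labels are one-hot encoded
train_data_gen = train_image_generator.flow_from_directory(
    directory=train_dir,                # read images from the training set path
    batch_size=batch_size,              # number of samples per batch
    shuffle=True,                       # shuffle the samples
    target_size=(im_height, im_width),  # resize images to 224x224
    class_mode='categorical')           # one-hot encoded labels
# Number of training samples
total_train = train_data_gen.n
# Define the validation image generator (rescaling only, no augmentation)
validation_image_generator = ImageDataGenerator(rescale=1./255)  # scale pixel values to [0, 1]
# Read samples from validation_dir with the generator
val_data_gen = validation_image_generator.flow_from_directory(
    directory=validation_dir,           # read images from the validation set path
    batch_size=batch_size,              # number of samples per batch
    shuffle=False,                      # keep the order fixed so predictions line up with val_data_gen.classes
    target_size=(im_height, im_width),  # resize images to 224x224
    class_mode='categorical')           # one-hot encoded labels
# Number of validation samples
total_val = val_data_gen.n
Output: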
Found 1098 images belonging to 10 classes.
Found 272 images belonging to 10 classes.
The training set has 1,098 images and the validation set has 272 images, covering 10 classes in total.
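With these counts and the batch size of 128 set earlier, the number of batches per epoch follows from simple division. The sketch below works through that arithmetic; it only restates numbers from this post and is meant as a sanity check, not as part of the training script.

```python
import math

total_train = 1098  # training images found by the generator
total_val = 272     # validation images found by the generator
batch_size = 128

# Floor division drops the final partial batch
steps_per_epoch = total_train // batch_size
validation_steps = total_val // batch_size

# Ceiling division includes the final partial batch
full_train_steps = math.ceil(total_train / batch_size)
full_val_steps = math.ceil(total_val / batch_size)

print("train steps (floor / ceil):", steps_per_epoch, full_train_steps)
print("val steps (floor / ceil):", validation_steps, full_val_steps)
print("validation images covered with floor:", validation_steps * batch_size)
```

With floor division, each validation pass covers 256 of the 272 images; covering all of them takes 3 steps, which is what `total_val // batch_size + 1` evaluates to here.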
# Use EfficientNetB0 from efficientnet.tfkeras with the ImageNet pre-trained weights
covn_base = efn.EfficientNetB0(weights='imagenet', include_top=False, input_shape=[im_height, im_width, 3])
covn_base.trainable = True
# Freeze the earlier layers and train only the last 10 layers
for layer in covn_base.layers[:-10]:
    layer.trainable = False
# Build the model
model = tf.keras.Sequential([
    covn_base,
    tf.keras.layers.GlobalAveragePooling2D(),        # global average pooling layer
    tf.keras.layers.Dense(10, activation='softmax')  # output layer (10 classes)
])
# Print the parameter information of each layer
model.summary()
# Compile the model
model.compile(
    optimizer=tf.keras.optimizers.Adam(),  # Adam optimizer
    loss='categorical_crossentropy',       # cross-entropy loss
    metrics=['accuracy']                   # evaluation metric
)
Output:
Model: "sequential_5"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
efficientnet-b0 (Model) (None, 7, 7, 1280) 4049564
_________________________________________________________________
global_average_pooling2d_5 ( (None, 1280) 0
_________________________________________________________________
dense_5 (Dense) (None, 10) 12810
=================================================================
Total params: 4,062,374
Trainable params: 906,042
Non-trainable params: 3,156,332
_________________________________________________________________
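The summary above shows the structure and parameter counts of the EfficientNetB0 model. Because most layers are frozen, only 906,042 of the 4,062,374 parameters are trainable.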
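The numbers in the summary can be cross-checked by hand. The short sketch below recomputes the Dense layer size and the trainable share from the figures printed above; the variable names are illustrative.

```python
features = 1280  # channels after global average pooling
classes = 10     # monkey species

# Dense layer: one weight per (feature, class) pair plus one bias per class
dense_params = features * classes + classes

base_params = 4_049_564  # efficientnet-b0 row in the summary
total_params = base_params + dense_params

trainable_params = 906_042  # reported by model.summary()
non_trainable_params = total_params - trainable_params

# Trainable weights that belong to the unfrozen layers of the base network
trainable_in_base = trainable_params - dense_params

print(dense_params, total_params, non_trainable_params, trainable_in_base)
print(f"trainable share: {trainable_params / total_params:.1%}")
```

About 22% of the weights are updated during training: the Dense head plus the last few unfrozen layers of the base network.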
def lrfn(epoch):
    LR_START = 0.00001     # initial learning rate
    LR_MAX = 0.0004        # maximum learning rate
    LR_MIN = 0.00001       # lower bound of the learning rate
    LR_RAMPUP_EPOCHS = 5   # the warm-up lasts 5 epochs
    LR_SUSTAIN_EPOCHS = 0  # number of epochs to hold the maximum learning rate
    LR_EXP_DECAY = .8      # exponential decay factor
    if epoch < LR_RAMPUP_EPOCHS:  # epochs 0-4 (0-based): linear increase
        lr = (LR_MAX - LR_START) / LR_RAMPUP_EPOCHS * epoch + LR_START
    elif epoch < LR_RAMPUP_EPOCHS + LR_SUSTAIN_EPOCHS:  # no sustain phase here
        lr = LR_MAX
    else:  # from epoch 5 (0-based): start at LR_MAX and decay exponentially
        lr = (LR_MAX - LR_MIN) * LR_EXP_DECAY**(epoch - LR_RAMPUP_EPOCHS - LR_SUSTAIN_EPOCHS) + LR_MIN
    return lr
# Plot the learning rate curve
rng = [i for i in range(epochs)]
y = [lrfn(x) for x in rng]
plt.plot(rng, y)
#print("Learning rate schedule: {:.3g} to {:.3g} to {:.3g}".format(y[0], max(y), y[-1]))
# Set the learning rate with the LearningRateScheduler callback
lr_schedule = tf.keras.callbacks.LearningRateScheduler(lrfn, verbose=1)
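# Save the best model
checkpoint = ModelCheckpoint(
    filepath='./save_weights/myefficientnet.ckpt',  # where to save the model
    monitor='val_accuracy',   # metric to monitor (matches the 'val_accuracy' key in the training log)
    save_weights_only=False,  # True saves only the weights; False saves the whole model (architecture, configuration, etc.)
    save_best_only=True,      # save only when the monitored metric improves
    mode='auto',              # 'max' for accuracy, 'min' for loss; 'auto' infers the direction from the metric name
    save_freq='epoch'         # check at the end of every epoch (replaces the deprecated period=1)
)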
# Start training
history = model.fit(x=train_data_gen,                           # training data
                    steps_per_epoch=total_train // batch_size,  # training steps per epoch
                    epochs=epochs,                              # number of training epochs
                    validation_data=val_data_gen,               # validation data
                    validation_steps=total_val // batch_size,   # validation steps per epoch
                    callbacks=[checkpoint, lr_schedule])        # callbacks to run
# Save the trained model weights
model.save_weights('./save_weights/myefficientnet.ckpt', save_format='tf')
# Record the accuracy and loss on the training and validation sets
history_dict = history.history
train_loss = history_dict["loss"]              # training loss
train_accuracy = history_dict["accuracy"]      # training accuracy
val_loss = history_dict["val_loss"]            # validation loss
val_accuracy = history_dict["val_accuracy"]    # validation accuracy
Output:
Epoch 00001: LearningRateScheduler reducing learning rate to 1e-05.
Epoch 1/15
8/8 [==============================] - 96s 12s/step - loss: 2.3121 - accuracy: 0.1045 - val_loss: 2.3039 - val_accuracy: 0.0742 - lr: 1.0000e-05
Epoch 00002: LearningRateScheduler reducing learning rate to 8.8e-05.
Epoch 2/15
8/8 [==============================] - 91s 11s/step - loss: 2.1725 - accuracy: 0.2062 - val_loss: 1.9805 - val_accuracy: 0.3203 - lr: 8.8000e-05
Epoch 00003: LearningRateScheduler reducing learning rate to 0.000166.
Epoch 3/15
8/8 [==============================] - 89s 11s/step - loss: 1.7967 - accuracy: 0.5237 - val_loss: 1.4632 - val_accuracy: 0.6680 - lr: 1.6600e-04
Epoch 00004: LearningRateScheduler reducing learning rate to 0.000244.
Epoch 4/15
8/8 [==============================] - 91s 11s/step - loss: 1.3409 - accuracy: 0.7619 - val_loss: 0.8830 - val_accuracy: 0.8203 - lr: 2.4400e-04
Epoch 00005: LearningRateScheduler reducing learning rate to 0.000322.
Epoch 5/15
8/8 [==============================] - 88s 11s/step - loss: 0.8884 - accuracy: 0.8546 - val_loss: 0.5035 - val_accuracy: 0.8672 - lr: 3.2200e-04
Epoch 00006: LearningRateScheduler reducing learning rate to 0.0004.
Epoch 6/15
8/8 [==============================] - 91s 11s/step - loss: 0.5728 - accuracy: 0.8897 - val_loss: 0.3326 - val_accuracy: 0.9023 - lr: 4.0000e-04
Epoch 00007: LearningRateScheduler reducing learning rate to 0.000322.
Epoch 7/15
8/8 [==============================] - 92s 12s/step - loss: 0.4229 - accuracy: 0.9144 - val_loss: 0.2310 - val_accuracy: 0.9297 - lr: 3.2200e-04
Epoch 00008: LearningRateScheduler reducing learning rate to 0.0002596000000000001.
Epoch 8/15
8/8 [==============================] - 93s 12s/step - loss: 0.3446 - accuracy: 0.9330 - val_loss: 0.1984 - val_accuracy: 0.9492 - lr: 2.5960e-04
Epoch 00009: LearningRateScheduler reducing learning rate to 0.00020968000000000004.
Epoch 9/15
8/8 [==============================] - 87s 11s/step - loss: 0.3070 - accuracy: 0.9371 - val_loss: 0.1866 - val_accuracy: 0.9492 - lr: 2.0968e-04
Epoch 00010: LearningRateScheduler reducing learning rate to 0.00016974400000000002.
Epoch 10/15
8/8 [==============================] - 92s 12s/step - loss: 0.2924 - accuracy: 0.9464 - val_loss: 0.1775 - val_accuracy: 0.9531 - lr: 1.6974e-04
Epoch 00011: LearningRateScheduler reducing learning rate to 0.00013779520000000003.
Epoch 11/15
8/8 [==============================] - 85s 11s/step - loss: 0.2580 - accuracy: 0.9402 - val_loss: 0.1743 - val_accuracy: 0.9609 - lr: 1.3780e-04
Epoch 00012: LearningRateScheduler reducing learning rate to 0.00011223616000000004.
Epoch 12/15
8/8 [==============================] - 90s 11s/step - loss: 0.2484 - accuracy: 0.9482 - val_loss: 0.1737 - val_accuracy: 0.9570 - lr: 1.1224e-04
Epoch 00013: LearningRateScheduler reducing learning rate to 9.178892800000003e-05.
Epoch 13/15
8/8 [==============================] - 87s 11s/step - loss: 0.2253 - accuracy: 0.9567 - val_loss: 0.1719 - val_accuracy: 0.9570 - lr: 9.1789e-05
Epoch 00014: LearningRateScheduler reducing learning rate to 7.543114240000003e-05.
Epoch 14/15
8/8 [==============================] - 87s 11s/step - loss: 0.2399 - accuracy: 0.9485 - val_loss: 0.1690 - val_accuracy: 0.9570 - lr: 7.5431e-05
Epoch 00015: LearningRateScheduler reducing learning rate to 6.234491392000002e-05.
Epoch 15/15
8/8 [==============================] - 87s 11s/step - loss: 0.2358 - accuracy: 0.9515 - val_loss: 0.1658 - val_accuracy: 0.9570 - lr: 6.2345e-05
The results show that the model converges quickly with transfer learning: by epoch 15 the training accuracy is 95.15% and the validation accuracy is 95.70%.
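The learning rates printed in the log follow directly from the warm-up and decay formula in lrfn. As a quick check, the sketch below re-implements the same formula in plain Python, with the constants turned into keyword arguments for convenience (that signature is an illustrative change, not the original one), and prints the value for each of the 15 epochs.

```python
def lrfn(epoch, lr_start=1e-5, lr_max=4e-4, lr_min=1e-5,
         rampup_epochs=5, sustain_epochs=0, exp_decay=0.8):
    """Linear warm-up, optional plateau, then exponential decay towards lr_min."""
    if epoch < rampup_epochs:
        return (lr_max - lr_start) / rampup_epochs * epoch + lr_start
    if epoch < rampup_epochs + sustain_epochs:
        return lr_max
    decay_steps = epoch - rampup_epochs - sustain_epochs
    return (lr_max - lr_min) * exp_decay ** decay_steps + lr_min


schedule = [lrfn(epoch) for epoch in range(15)]
for epoch, lr in enumerate(schedule):
    # Keras prints epochs 1-based, so epoch index 0 is "Epoch 1/15" in the log
    print(f"Epoch {epoch + 1:2d}/15: lr = {lr:.4e}")
```

The peak of 4.0000e-04 lands on Epoch 6/15 in the log, after which each epoch shrinks the gap to the 1e-05 floor by a factor of 0.8.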
plt.figure()
plt.plot(range(epochs), train_loss, label='train_loss')
plt.plot(range(epochs), val_loss, label='val_loss')
plt.legend()
plt.xlabel('epochs')
plt.ylabel('loss')
plt.figure()
plt.plot(range(epochs), train_accuracy, label='train_accuracy')
plt.plot(range(epochs), val_accuracy, label='val_accuracy')
plt.legend()
plt.xlabel('epochs')
plt.ylabel('accuracy')
plt.show()
def plot_confusion_matrix(cm, target_names, title='Confusion matrix', cmap=None, normalize=False):
    accuracy = np.trace(cm) / float(np.sum(cm))  # overall accuracy
    misclass = 1 - accuracy  # error rate
    if cmap is None:
        cmap = plt.get_cmap('Blues')  # default to the blue colormap
    plt.figure(figsize=(10, 8))  # figure size
    plt.imshow(cm, interpolation='nearest', cmap=cmap)  # draw the matrix
    plt.title(title)  # title
    plt.colorbar()  # color bar
    if target_names is not None:
        tick_marks = np.arange(len(target_names))
        plt.xticks(tick_marks, target_names, rotation=90)  # rotate the x tick labels by 90 degrees
        plt.yticks(tick_marks, target_names)  # y tick labels
    if normalize:
        # Divide each row by its own sum; the row sums must be a column vector to broadcast correctly
        cm = cm.astype('float32') / cm.sum(axis=1)[:, np.newaxis]
        cm = np.round(cm, 2)  # keep two decimal places
    thresh = cm.max() / 1.5 if normalize else cm.max() / 2
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):  # iterate over every (row, column) cell
        if normalize:  # normalized values
            plt.text(j, i, "{:0.2f}".format(cm[i, j]),  # two decimal places
                     horizontalalignment="center",  # center the number in the cell
                     color="white" if cm[i, j] > thresh else "black")  # text color
        else:  # raw counts
            plt.text(j, i, "{:,}".format(cm[i, j]),
                     horizontalalignment="center",  # center the number in the cell
                     color="white" if cm[i, j] > thresh else "black")  # text color
    plt.tight_layout()  # adjust the layout so everything fits in the figure
    plt.ylabel('True label')  # y-axis label
    plt.xlabel("Predicted label\naccuracy={:0.4f}\n misclass={:0.4f}".format(accuracy, misclass))  # x-axis label
    plt.show()  # show the figure
# Read the monkey class names from the 'Common Name' column into labels
cols = ['Label','Latin Name', 'Common Name','Train Images', 'Validation Images']
labels = pd.read_csv("../input/10-monkey-species/monkey_labels.txt", names=cols, skiprows=1)
labels = labels['Common Name'].str.strip()  # remove any padding whitespace around the names
# Predict class probabilities for the whole validation set
Y_pred = model.predict(val_data_gen, steps=total_val // batch_size + 1)
# Convert the predicted probabilities to class indices
Y_pred_classes = np.argmax(Y_pred, axis=1)
# Compute the confusion matrix
confusion_mtx = confusion_matrix(y_true=val_data_gen.classes, y_pred=Y_pred_classes)
# Plot the confusion matrix
plot_confusion_matrix(confusion_mtx, normalize=True, target_names=labels)
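Output:
The confusion matrix shows that most of the 10 monkey species in the validation set are predicted with more than 95% accuracy, and the overall accuracy is 95.59%, so the model performs well.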
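For readers who want to see how the overall accuracy and the per-class values are derived from a confusion matrix, here is a minimal sketch in plain Python. The 3-class matrix is made up for illustration and is not taken from the monkey dataset; it also assumes every class has at least one sample, so no row sums to zero.

```python
def accuracy(cm):
    """Overall accuracy: correct predictions (the diagonal) over all predictions."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def normalize_rows(cm):
    """Divide each row by its own sum so that every row adds up to 1."""
    return [[value / sum(row) for value in row] for row in cm]


# Hypothetical confusion matrix: rows are true classes, columns are predictions
cm = [
    [8, 1, 1],
    [0, 9, 1],
    [2, 0, 8],
]

print(f"accuracy = {accuracy(cm):.4f}")
for true_class, row in enumerate(normalize_rows(cm)):
    # The diagonal entry of a row-normalized matrix is the recall of that class
    print(true_class, [f"{value:.2f}" for value in row])
```

After row normalization, each diagonal entry is the fraction of that class's images that were classified correctly, which is the per-class figure read off the plotted matrix.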