The code analyzed in this post comes from https://github.com/yhenon/keras-frcnn (will be removed upon request if this infringes).
Keras is a Python deep-learning library comparable to TensorFlow: it can likewise be used to build convolutional neural networks and is easier to work with than raw TensorFlow, though it still requires a TensorFlow or Theano backend. This post analyzes the TensorFlow-backed implementation with keras==2.0.3. The dataset is Pascal VOC2007: after downloading it, manually merge the train and test sets and place them in a train_path folder under the project root (you need to create this folder yourself).
Dataset download: https://pjreddie.com/projects/pascal-voc-dataset-mirror/
The code starts in train_frcnn:
1. Set the data path, the image-resizing parameters, the choice of feature-extraction network, and the initial hyperparameters; load pretrained weights if they exist (here, the VGG16 pretrained weights).
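keras-frcnn collects these settings in a configuration object (config.py in the repo). The sketch below illustrates the kinds of fields involved; the field names are modeled on that file, but the exact names and default values should be checked against the repo.

```python
# Simplified sketch of the training configuration.
# Field names modeled on keras-frcnn's config.py; values are illustrative.
class Config:
    def __init__(self):
        self.network = 'vgg'                  # feature-extraction backbone
        self.im_size = 600                    # shorter image side resized to this
        self.anchor_box_scales = [128, 256, 512]
        self.anchor_box_ratios = [[1, 1], [1, 2], [2, 1]]
        self.rpn_stride = 16                  # downsampling factor of the shared convs
        self.num_rois = 32                    # RoIs fed to the detector head per image
        self.model_path = 'model_frcnn.hdf5'  # where the best weights are saved

C = Config()
print(C.rpn_stride)  # 16
```

With a stride of 16 and three scales times three ratios, each feature-map position predicts 9 anchors, matching the Faster R-CNN design.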
2. Parse the Pascal VOC2007 annotations into all_imgs, recording each image's file path, width, height, and bboxes; count the number of samples per class and sort the classes (the class list must include a background class); randomly shuffle the image order; and augment the dataset (rotations, ...).
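The per-class counting and the addition of the background class can be sketched as follows. The image dicts are hypothetical stand-ins for what the repo's data parser produces; the key names (filepath, width, height, bboxes, bg) follow the repo's conventions.

```python
# Sketch of step 2: tally per-class box counts, add a background class,
# build a class-to-index mapping, and shuffle the image order.
import random

# Hypothetical parsed annotations (in the repo these come from the VOC XML files).
all_imgs = [
    {'filepath': 'img1.jpg', 'width': 500, 'height': 375,
     'bboxes': [{'class': 'dog', 'x1': 10, 'y1': 20, 'x2': 200, 'y2': 300}]},
    {'filepath': 'img2.jpg', 'width': 500, 'height': 333,
     'bboxes': [{'class': 'cat', 'x1': 5, 'y1': 5, 'x2': 100, 'y2': 90},
                {'class': 'dog', 'x1': 120, 'y1': 30, 'x2': 400, 'y2': 320}]},
]

# Count how many ground-truth boxes each class has.
classes_count = {}
for img in all_imgs:
    for bbox in img['bboxes']:
        classes_count[bbox['class']] = classes_count.get(bbox['class'], 0) + 1

# The classifier head needs an explicit background class.
if 'bg' not in classes_count:
    classes_count['bg'] = 0

# Map each class name to an integer label.
class_mapping = {cls: i for i, cls in enumerate(classes_count)}

random.shuffle(all_imgs)  # randomize training order
print(classes_count)      # {'dog': 2, 'cat': 1, 'bg': 0}
```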
3. Call the prebuilt convolutional network to construct the shared convolutions. The code is as follows:
from keras.layers import Input, Conv2D, MaxPooling2D

# input tensor: variable-size RGB images (TensorFlow channels-last ordering)
img_input = Input(shape=(None, None, 3))
# Block 1
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
# Block 2
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
# Block 3
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
# Block 4
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
# Block 5
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
# block5_pool is omitted so the shared feature map keeps a stride of 16 for the RPN
# x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
As the code above shows, building a network in Keras is comparatively straightforward, and the result is well suited to fine-tuning: set layer.trainable = False on the layers that should not be trained, and the different parts of the network can then be trained either jointly or independently.
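A minimal sketch of this freezing pattern: in Keras each layer has a name and a trainable flag, and setting trainable = False before compile() excludes the layer from gradient updates. The helper below works on any layer-like objects, so it is demonstrated with stand-ins; in practice you would pass model.layers.

```python
# Freeze every layer that comes before the one named stop_name,
# e.g. to fine-tune only the later VGG blocks.
def freeze_until(layers, stop_name):
    for layer in layers:
        if layer.name == stop_name:
            break
        layer.trainable = False

# Stand-in layer objects for illustration (in practice: model.layers,
# with names like those defined in the block above).
class _Layer:
    def __init__(self, name):
        self.name = name
        self.trainable = True

layers = [_Layer(n) for n in ('block1_conv1', 'block1_conv2', 'block2_conv1')]
freeze_until(layers, 'block2_conv1')
print([l.trainable for l in layers])  # [False, False, True]
```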
4. Repeat the pattern above to build the remaining networks, define the loss functions, compile the models, and then run training (backpropagation). During training, the script always saves the model whose loss is the lowest seen so far, so the final saved model is the best one and the test results are not degraded by training for too long.
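The "keep the best model" logic amounts to a simple best-loss checkpoint inside the training loop. A runnable sketch (in the real script the save step is model.save_weights on the configured model path; here it just records what would be saved):

```python
# Track the lowest loss seen so far and "save" only when it improves.
best_loss = float('inf')
saved = []  # stand-in for writing weights to disk

for epoch_loss in [4.2, 3.1, 3.5, 2.8]:  # illustrative per-epoch losses
    if epoch_loss < best_loss:
        best_loss = epoch_loss
        saved.append(epoch_loss)  # model.save_weights(C.model_path) in the real script

print(best_loss)  # 2.8
print(saved)      # [4.2, 3.1, 2.8]
```

Because the checkpoint is only overwritten on improvement, a later epoch whose loss regresses (3.5 above) leaves the best weights untouched.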