论文提交ICLR 2017
论文地址:https://arxiv.org/abs/1602.07360
代码地址:https://github.com/DeepScale/SqueezeNet
注:代码只放出了prototxt文件和训练好的caffemodel,因为整个网络都是基于caffe的,有这两样东西就足够了。
在这里只是简要的介绍文章的内容,具体细节的东西可以自行翻阅论文。
在相同的精度下,模型参数更少有3个好处:
即 高效的分布式训练、更容易替换模型、更方便FPGA和嵌入式部署。
鉴于此,提出3种策略:
即
关于SqueezeNet的构建细节在文中也有详细的描述
这里,我首先尝试的是使用alt-opt,但是很遗憾的是,出来的结果很糟糕,基本不能用,后来改为使用end2end,在最开始的时候,采用的就是faster rcnn官方提供的zfnet end2end训练的solvers,又很不幸的是,在网络运行大概400步后出现:
loss = NAN
遇到这个问题,把学习率改为以前的1/10,解决。
直接上prototxt文件,前面都是一样的,只需要改动zfnet中的conv1-con5部分,外加把fc6-fc7改成squeeze中的卷积链接。
prototxt太长,给出每个部分的前面和后面部分:
name: "Alex_Squeeze_v1.1"
layer {
name: 'input-data'
type: 'Python'
top: 'data'
top: 'im_info'
top: 'gt_boxes'
python_param {
module: 'roi_data_layer.layer'
layer: 'RoIDataLayer'
param_str: "'num_classes': 4"
}
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 64
kernel_size: 3
stride: 2
}
}
.
.
.
layer {
name: "drop9"
type: "Dropout"
bottom: "fire9/concat"
top: "fire9/concat"
dropout_param {
dropout_ratio: 0.5
}
}
#========= RPN ============
layer {
name: "rpn_conv/3x3"
type: "Convolution"
bottom: "fire9/concat"
top: "rpn/output"
param { lr_mult: 1.0 }
param { lr_mult: 2.0 }
convolution_param {
num_output: 256
kernel_size: 3 pad: 1 stride: 1
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
.
.
.
layer {
name: "drop9"
type: "Dropout"
bottom: "fire9/concat"
top: "fire9/concat"
dropout_param {
dropout_ratio: 0.5
}
}
#========= RPN ============
layer {
name: "rpn_conv/3x3"
type: "Convolution"
bottom: "fire9/concat"
top: "rpn/output"
param { lr_mult: 1.0 }
param { lr_mult: 2.0 }
convolution_param {
num_output: 256
kernel_size: 3 pad: 1 stride: 1
weight_filler { type: "gaussian" std: 0.01 }
bias_filler { type: "constant" value: 0 }
}
}
.
.
.
layer {
name: 'roi-data'
type: 'Python'
bottom: 'rpn_rois'
bottom: 'gt_boxes'
top: 'rois'
top: 'labels'
top: 'bbox_targets'
top: 'bbox_inside_weights'
top: 'bbox_outside_weights'
python_param {
module: 'rpn.proposal_target_layer'
layer: 'ProposalTargetLayer'
param_str: "'num_classes': 4"
}
}
#===================== RCNN =============
layer {
name: "roi_pool5"
type: "ROIPooling"
bottom: "fire9/concat"
bottom: "rois"
top: "roi_pool5"
roi_pooling_param {
pooled_w: 7
pooled_h: 7
spatial_scale: 0.0625 # 1/16
}
}
layer {
name: "conv1_last"
type: "Convolution"
bottom: "roi_pool5"
top: "conv1_last"
param { lr_mult: 1.0 }
param { lr_mult: 1.0 }
convolution_param {
num_output: 1000
kernel_size: 1
weight_filler {
type: "gaussian"
mean: 0.0
std: 0.01
}
}
}
layer {
name: "relu/conv1_last"
type: "ReLU"
bottom: "conv1_last"
top: "relu/conv1_last"
}
layer {
name: "cls_score"
type: "InnerProduct"
bottom: "relu/conv1_last"
top: "cls_score"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 5
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "bbox_pred"
type: "InnerProduct"
bottom: "relu/conv1_last"
top: "bbox_pred"
param {
lr_mult: 1
}
param {
lr_mult: 2
}
inner_product_param {
num_output: 20
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss_cls"
type: "SoftmaxWithLoss"
bottom: "cls_score"
bottom: "labels"
propagate_down: 1
propagate_down: 0
top: "loss_cls"
loss_weight: 1
}
layer {
name: "loss_bbox"
type: "SmoothL1Loss"
bottom: "bbox_pred"
bottom: "bbox_targets"
bottom: "bbox_inside_weights"
bottom: "bbox_outside_weights"
top: "loss_bbox"
loss_weight: 1
}
后面一部分的结构如图:
注意红圈部分,以前的fc换成了squ中的卷积层,这样网络参数大大减少,因为我改动了rpn部分选proposal的比例和数量,共采用改了70种选择,所以最后训练出来的模型为17M,比初始化4.8M大很多,不过也已经很小了。
OHEM无非就是多了一个readonly部分,不过加上之后效果会好很多,和上面的方式一致,放出一部分prototxt,其他的课自行补上。从rpn那里开始,前面部分和上面给出的完全一样
#====== RoI Proposal ====================
layer {
name: "rpn_cls_prob"
type: "Softmax"
bottom: "rpn_cls_score_reshape"
top: "rpn_cls_prob"
}
layer {
name: 'rpn_cls_prob_reshape'
type: 'Reshape'
bottom: 'rpn_cls_prob'
top: 'rpn_cls_prob_reshape'
reshape_param { shape { dim: 0 dim: 140 dim: -1 dim: 0 } }
}
layer {
name: 'proposal'
type: 'Python'
bottom: 'rpn_cls_prob_reshape'
bottom: 'rpn_bbox_pred'
bottom: 'im_info'
top: 'rpn_rois'
python_param {
module: 'rpn.proposal_layer'
layer: 'ProposalLayer'
param_str: "'feat_stride': 16"
}
}
layer {
name: 'roi-data'
type: 'Python'
bottom: 'rpn_rois'
bottom: 'gt_boxes'
top: 'rois'
top: 'labels'
top: 'bbox_targets'
top: 'bbox_inside_weights'
top: 'bbox_outside_weights'
python_param {
module: 'rpn.proposal_target_layer'
layer: 'ProposalTargetLayer'
param_str: "'num_classes': 4"
}
}
##########################
## Readonly RoI Network ##
######### Start ##########
layer {
name: "roi_pool5_readonly"
type: "ROIPooling"
bottom: "fire9/concat"
bottom: "rois"
top: "pool5_readonly"
propagate_down: false
propagate_down: false
roi_pooling_param {
pooled_w: 6
pooled_h: 6
spatial_scale: 0.0625 # 1/16
}
}
layer {
name: "conv1_last_readonly"
type: "Convolution"
bottom: "pool5_readonly"
top: "conv1_last_readonly"
propagate_down: false
param {
name: "conv1_last_w"
}
param {
name: "conv1_last_b"
}
convolution_param {
num_output: 1000
kernel_size: 1
weight_filler {
type: "gaussian"
mean: 0.0
std: 0.01
}
}
}
layer {
name: "relu/conv1_last_readonly"
type: "ReLU"
bottom: "conv1_last_readonly"
top: "relu/conv1_last_readonly"
propagate_down: false
}
layer {
name: "cls_score_readonly"
type: "InnerProduct"
bottom: "relu/conv1_last_readonly"
top: "cls_score_readonly"
propagate_down: false
param {
name: "cls_score_w"
}
param {
name: "cls_score_b"
}
inner_product_param {
num_output: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "bbox_pred_readonly"
type: "InnerProduct"
bottom: "relu/conv1_last_readonly"
top: "bbox_pred_readonly"
propagate_down: false
param {
name: "bbox_pred_w"
}
param {
name: "bbox_pred_b"
}
inner_product_param {
num_output: 16
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "cls_prob_readonly"
type: "Softmax"
bottom: "cls_score_readonly"
top: "cls_prob_readonly"
propagate_down: false
}
layer {
name: "hard_roi_mining"
type: "Python"
bottom: "cls_prob_readonly"
bottom: "bbox_pred_readonly"
bottom: "rois"
bottom: "labels"
bottom: "bbox_targets"
bottom: "bbox_inside_weights"
bottom: "bbox_outside_weights"
top: "rois_hard"
top: "labels_hard"
top: "bbox_targets_hard"
top: "bbox_inside_weights_hard"
top: "bbox_outside_weights_hard"
propagate_down: false
propagate_down: false
propagate_down: false
propagate_down: false
propagate_down: false
propagate_down: false
propagate_down: false
python_param {
module: "roi_data_layer.layer"
layer: "OHEMDataLayer"
param_str: "'num_classes': 4"
}
}
########## End ###########
## Readonly RoI Network ##
##########################
#===================== RCNN =============
layer {
name: "roi_pool5"
type: "ROIPooling"
bottom: "fire9/concat"
bottom: "rois_hard"
top: "roi_pool5"
propagate_down: true
propagate_down: false
roi_pooling_param {
pooled_w: 7
pooled_h: 7
spatial_scale: 0.0625 # 1/16
}
}
layer {
name: "conv1_last"
type: "Convolution"
bottom: "roi_pool5"
top: "conv1_last"
param {
lr_mult: 1.0
name: "conv1_last_w"
}
param {
lr_mult: 1.0
name: "conv1_last_b"
}
convolution_param {
num_output: 1000
kernel_size: 1
weight_filler {
type: "gaussian"
mean: 0.0
std: 0.01
}
}
}
layer {
name: "relu/conv1_last"
type: "ReLU"
bottom: "conv1_last"
top: "relu/conv1_last"
}
layer {
name: "cls_score"
type: "InnerProduct"
bottom: "relu/conv1_last"
top: "cls_score"
param {
lr_mult: 1
name: "cls_score_w"
}
param {
lr_mult: 2
name: "cls_score_b"
}
inner_product_param {
num_output: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "bbox_pred"
type: "InnerProduct"
bottom: "relu/conv1_last"
top: "bbox_pred"
param {
lr_mult: 1
name: "bbox_pred_w"
}
param {
lr_mult: 2
name: "bbox_pred_b"
}
inner_product_param {
num_output: 16
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "loss_cls"
type: "SoftmaxWithLoss"
bottom: "cls_score"
bottom: "labels_hard"
propagate_down: true
propagate_down: false
top: "loss_cls"
loss_weight: 1
}
layer {
name: "loss_bbox"
type: "SmoothL1Loss"
bottom: "bbox_pred"
bottom: "bbox_targets_hard"
bottom: "bbox_inside_weights_hard"
bottom: "bbox_outside_weights_hard"
top: "loss_bbox"
loss_weight: 1
propagate_down: false
propagate_down: false
propagate_down: false
propagate_down: false
}
结构图如下:
比前面训练的多一个readonly部分,具体可参考论文:
Training Region-based Object Detectors with Online Hard Example Mining
https://arxiv.org/abs/1604.03540
至此,SqueezeNet+Faster RCNN 框架便介绍完了,运行速度在GPU下大概是ZF的5倍,CPU下大概为2。5倍。
http://blog.csdn.net/u011956147/article/details/53714616