Caffe-SSD-MobileNetV1

1. Configure caffe-ssd

  • GitHub address: https://github.com/weiliu89/caffe
  • Download caffe-ssd with the following commands (rename the folder as you like):
git clone https://github.com/weiliu89/caffe.git
cd caffe
git checkout ssd
  • MobileNet uses extra layers (relu6_layer, ConvolutionDepthwise, DepthwiseConvolutionalLayer), so follow the steps below:
  • I will upload these three kinds of files (.cpp, .hpp, .cu) to Baidu: to be continued.

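  • Move the .cpp and .cu files to /path/to/caffe-ssd/src/caffe/layers/
  • Move the .hpp files to /path/to/caffe-ssd/include/caffe/layers/
  • Add the following code to caffe's caffe.proto file (the two optional fields belong inside message LayerParameter):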
optional ReLU6Parameter relu6_param = 100000; //relu6
optional ConvolutionDepthwiseParameter convolution_depthwise_param = 151; //convolution_depthwise

and

//relu6
message ReLU6Parameter {
  optional float negative_slope = 1 [default = 0];
}

//convolution_depthwise
message ConvolutionDepthwiseParameter{
  optional uint32 num_output = 1; // The number of outputs for the layer
  optional bool bias_term = 2 [default = true]; // whether to have bias terms

  // Pad, kernel size, and stride are all given as a single value for equal
  // dimensions in all spatial dimensions, or once per spatial dimension.
  repeated uint32 pad = 3; // The padding size; defaults to 0
  repeated uint32 kernel_size = 4; // The kernel size
  repeated uint32 stride = 6; // The stride; defaults to 1
  // Factor used to dilate the kernel, (implicitly) zero-filling the resulting
  // holes. (Kernel dilation is sometimes referred to by its use in the
  // algorithme à trous from Holschneider et al. 1987.)
  repeated uint32 dilation = 18; // The dilation; defaults to 1

  // For 2D convolution only, the *_h and *_w versions may also be used to
  // specify both spatial dimensions.
  optional uint32 pad_h = 9 [default = 0]; // The padding height (2D only)
  optional uint32 pad_w = 10 [default = 0]; // The padding width (2D only)
  optional uint32 kernel_h = 11; // The kernel height (2D only)
  optional uint32 kernel_w = 12; // The kernel width (2D only)
  optional uint32 stride_h = 13; // The stride height (2D only)
  optional uint32 stride_w = 14; // The stride width (2D only)

  optional uint32 group = 5 [default = 1]; // The group size for group conv

  optional FillerParameter weight_filler = 7; // The filler for the weight
  optional FillerParameter bias_filler = 8; // The filler for the bias
  enum Engine {
    DEFAULT = 0;
    CAFFE = 1;
    CUDNN = 2;
  }
  optional Engine engine = 15 [default = DEFAULT];

  // The axis to interpret as "channels" when performing convolution.
  // Preceding dimensions are treated as independent inputs;
  // succeeding dimensions are treated as "spatial".
  // With (N, C, H, W) inputs, and axis == 1 (the default), we perform
  // N independent 2D convolutions, sliding C-channel (or (C/g)-channels, for
  // groups g>1) filters across the spatial axes (H, W) of the input.
  // With (N, C, D, H, W) inputs, and axis == 1, we perform
  // N independent 3D convolutions, sliding (C/g)-channels
  // filters across the spatial axes (D, H, W) of the input.
  optional int32 axis = 16 [default = 1];

  // Whether to force use of the general ND convolution, even if a specific
  // implementation for blobs of the appropriate number of spatial dimensions
  // is available. (Currently, there is only a 2D-specific convolution
  // implementation; for input blobs with num_axes != 2, this option is
  // ignored and the ND implementation will be used.)
  optional bool force_nd_im2col = 17 [default = false];

}
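To make clear what the two added layer types compute, here is a minimal pure-Python sketch. It is not the code of the .cpp/.cu files; the function names relu6 and depthwise_conv2d are made up for illustration, and the sketch assumes the usual definitions: ReLU6 clamps values to [0, 6], and a depthwise convolution applies one kernel per input channel without mixing channels.

```python
def relu6(x):
    """ReLU6: clamp a single activation to the range [0, 6]."""
    return min(max(x, 0.0), 6.0)


def depthwise_conv2d(x, w, stride=1, pad=0):
    """Naive depthwise convolution (illustrative, no bias).

    x: list of C channels, each an H x W list of lists.
    w: list of C kernels, each a kh x kw list of lists.
    Channel c of the input is convolved only with kernel c.
    """
    out = []
    for chan, kern in zip(x, w):
        h, wd = len(chan), len(chan[0])
        kh, kw = len(kern), len(kern[0])
        out_h = (h + 2 * pad - kh) // stride + 1
        out_w = (wd + 2 * pad - kw) // stride + 1
        plane = []
        for i in range(out_h):
            row = []
            for j in range(out_w):
                acc = 0.0
                for di in range(kh):
                    for dj in range(kw):
                        r = i * stride + di - pad
                        c = j * stride + dj - pad
                        # Positions outside the image count as zero padding.
                        if 0 <= r < h and 0 <= c < wd:
                            acc += chan[r][c] * kern[di][dj]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out
```

The output has the same number of channels as the input, which is the reason this kind of layer is cheaper than a full convolution.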

 

  • Then build and install caffe; see this blog post: https://blog.csdn.net/qq_40755643/article/details/96346453

 

2. Configure MobileNet-SSD

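MobileNet-SSD depends on the caffe-ssd we just configured.

GitHub download address: https://github.com/chuanqi305/MobileNet-SSD

Put that folder under caffe-ssd/examples.

The MobileNet-SSD-master/template folder holds the .prototxt files used for training. This is where the network is adapted: the dataset folder paths, the input image size, the batch_size and other parameters are all changed in these files.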


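Then, in the MobileNet-SSD folder, run gen_model.sh to turn the files in the template folder into training files. My dataset has 2 target classes plus background, so the argument is 3. On success the script creates an example folder containing the files used for training.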

sh gen_model.sh num_class+1



3. Convert the data to VOC format

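Create the folder data/VOCdevkit/Hisense under your home directory (the scripts below expect $HOME/data/VOCdevkit).

The layout is as follows:

VOCdevkit 
——Hisense 
————Annotations #all xml files go here 
————ImageSets 
——————Main #train.txt and val.txt go here 
————JPEGImages #all image files go here 
In Main, train.txt lists the training set and val.txt lists the validation set.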
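
The folder skeleton above can also be created from Python. This is only a sketch: the helper name make_voc_dirs is made up for illustration, and the devkit root is whatever path you pass in.

```python
import os


def make_voc_dirs(devkit_root, dataset_name="Hisense"):
    """Create the VOC-style folder skeleton and return the dataset folder."""
    base = os.path.join(devkit_root, dataset_name)
    for sub in ("Annotations", os.path.join("ImageSets", "Main"), "JPEGImages"):
        path = os.path.join(base, sub)
        if not os.path.isdir(path):  # explicit check so it also runs on Python 2
            os.makedirs(path)
    return base


# Example call (creates ~/data/VOCdevkit/Hisense/...):
# make_voc_dirs(os.path.expanduser("~/data/VOCdevkit"))
```

After that, copy the xml files into Annotations and the images into JPEGImages by hand.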

Split the dataset into training and test sets with a Python script, classify.py, which generates the .txt files:

import os
import random

trainval_percent = 0.8    # share of all samples that goes to trainval, i.e. trainval:test = 8:2
train_percent = 0.7       # share of trainval that goes to train
fdir = '/home/xxx/data/VOCdevkit/Hisense/ImageSets/Main/'      # change to your path
xmlfilepath = '/home/xxx/data/VOCdevkit/Hisense/Annotations/'  # change to your path
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)
indices = range(num)      # renamed from "list" to avoid shadowing the builtin
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = set(random.sample(trainval, tr))
trainval = set(trainval)  # sets make the membership tests below fast

ftrainval = open(fdir + 'trainval.txt', 'w')
ftest = open(fdir + 'test.txt', 'w')
ftrain = open(fdir + 'train.txt', 'w')
fval = open(fdir + 'val.txt', 'w')

for i in indices:
    name = os.path.splitext(total_xml[i])[0] + '\n'  # file name without the .xml extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()

After running it, four txt files appear in the /ImageSets/Main/ folder: trainval.txt, test.txt, train.txt and val.txt.
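
Before going on, it can be worth checking that the four files are consistent with each other. The following is a small sketch of such a check; the helper names read_names and check_splits are made up for illustration, and it only assumes one sample name per line, as written by classify.py.

```python
import os


def read_names(path):
    """Read one sample name per line, skipping empty lines."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def check_splits(main_dir):
    """Check that the four split files agree and return the size of each split."""
    sets = {}
    for split in ("trainval", "test", "train", "val"):
        sets[split] = set(read_names(os.path.join(main_dir, split + ".txt")))
    assert not (sets["trainval"] & sets["test"]), "trainval and test overlap"
    assert not (sets["train"] & sets["val"]), "train and val overlap"
    assert sets["train"] | sets["val"] == sets["trainval"], "train + val != trainval"
    return dict((split, len(names)) for split, names in sets.items())
```

Calling check_splits on your ImageSets/Main folder either raises an AssertionError naming the problem or returns the number of samples per split.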

 

The next step is to convert the image files into the lmdb files used for training.

Create a new folder Hisense in caffe-ssd/data, and copy create_data.sh, create_list.sh and labelmap_voc.prototxt from the VOC0712 folder into it.

labelmap_voc.prototxt

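This file maps labels to text. My dataset has two classes plus background, so it is a 3-class problem. The labels in my xml files are a and b; each name in this file must match the object names used in your own dataset's xml files. So the content only needs to be changed to the following: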

item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "a"
  label: 1
  display_name: "a"
}
item {
  name: "b"
  label: 2
  display_name: "b"
}
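
For datasets with more classes, writing these items by hand gets tedious. Here is a sketch that generates the same text from a list of class names; the function name make_labelmap is made up for illustration, and the output format simply mirrors the file shown above.

```python
def make_labelmap(class_names):
    """Return labelmap text: label 0 is background, then one item per class."""
    items = [("none_of_the_above", 0, "background")]
    for label, name in enumerate(class_names, 1):
        items.append((name, label, name))
    blocks = []
    for name, label, display in items:
        blocks.append(
            'item {\n  name: "%s"\n  label: %d\n  display_name: "%s"\n}\n'
            % (name, label, display)
        )
    return "".join(blocks)


# Example: write the file for the two classes used in this post.
# with open("labelmap_voc.prototxt", "w") as f:
#     f.write(make_labelmap(["a", "b"]))
```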

create_list.sh

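This script takes the txt files we just created in /ImageSets/Main/ and turns them into the list format needed for the lmdb conversion, which prepares the input for create_data.sh in the next step. The lines to change are the ones with a trailing comment; adjust the paths and names to your own setup.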
 

#!/bin/bash

root_dir=$HOME/data/VOCdevkit/ #dataset root directory
sub_dir=ImageSets/Main         #path of the Main folder created above
bash_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
for dataset in trainval test    #training and test sets
do
  dst_file=$bash_dir/$dataset.txt
  if [ -f $dst_file ]
  then
    rm -f $dst_file
  fi
  for name in Hisense           #dataset name
  do
    if [[ $dataset == "test" && $name == "VOC2012" ]]
    then
      continue
    fi
    echo "Create list for $name $dataset..."
    dataset_file=$root_dir/$name/$sub_dir/$dataset.txt

    img_file=$bash_dir/$dataset"_img.txt"
    cp $dataset_file $img_file
    sed -i "s/^/$name\/JPEGImages\//g" $img_file
    sed -i "s/$/.jpg/g" $img_file

    label_file=$bash_dir/$dataset"_label.txt"
    cp $dataset_file $label_file
    sed -i "s/^/$name\/Annotations\//g" $label_file
    sed -i "s/$/.xml/g" $label_file

    paste -d' ' $img_file $label_file >> $dst_file

    rm -f $label_file
    rm -f $img_file
  done

  # Generate image name and size information.
  if [ $dataset == "test" ]
  then
    $bash_dir/../../build/tools/get_image_size $root_dir $dst_file $bash_dir/$dataset"_name_size.txt"
  fi

  # Shuffle trainval file.
  if [ $dataset == "trainval" ]
  then
    rand_file=$dst_file.random
    cat $dst_file | perl -MList::Util=shuffle -e 'print shuffle();' > $rand_file
    mv $rand_file $dst_file
  fi
done

Run the following in the caffe-ssd/data/Hisense directory:

./create_list.sh

Three txt files are now generated in the Hisense folder: test.txt, trainval.txt and test_name_size.txt.
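
To make the format of trainval.txt and test.txt concrete: each line pairs an image path with its annotation path, both relative to the VOCdevkit root. The sketch below reproduces what the sed and paste commands in create_list.sh build; the function name make_list_lines is made up for illustration, and like the script it assumes every image has a .jpg extension.

```python
def make_list_lines(names, dataset_name="Hisense"):
    """Pair each sample name with its image and annotation path."""
    lines = []
    for name in names:
        img = "%s/JPEGImages/%s.jpg" % (dataset_name, name)
        xml = "%s/Annotations/%s.xml" % (dataset_name, name)
        lines.append(img + " " + xml)
    return lines
```

If your images are not .jpg files, the list will point at files that do not exist, so check the extension before building the lmdb.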

create_data.sh

Build the LMDB dataset.

cur_dir=$(cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
root_dir=$cur_dir/../..

cd $root_dir

redo=1
data_root_dir="$HOME/data/VOCdevkit"    #dataset root directory
dataset_name="Hisense"                  #dataset name
mapfile="$root_dir/data/$dataset_name/labelmap_voc.prototxt"    #path to the label map
anno_type="detection"
db="lmdb"
min_dim=0
max_dim=0
width=0
height=0

extra_cmd="--encode-type=jpg --encoded"
if [ $redo ]
then
  extra_cmd="$extra_cmd --redo"
fi
for subset in test trainval
do
  python $root_dir/scripts/create_annoset.py --anno-type=$anno_type --label-map-file=$mapfile --min-dim=$min_dim --max-dim=$max_dim --resize-width=$width --resize-height=$height --check-label $extra_cmd $data_root_dir $root_dir/data/$dataset_name/$subset.txt $data_root_dir/$dataset_name/$db/$dataset_name"_"$subset"_"$db examples/$dataset_name
done

Run the following in the caffe-ssd/data/Hisense directory:

./create_data.sh

This creates an lmdb folder under /home/xxx/data/VOCdevkit/Hisense/.

It contains the two generated lmdb databases: Hisense_trainval_lmdb and Hisense_test_lmdb


4. Training

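Put the paths of the corresponding lmdb files into the train and test prototxt files (MobileNetSSD_train.prototxt and MobileNetSSD_test.prototxt) in the example folder generated earlier by gen_model.sh: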

data_param {
    source: "/home/boyun/Hisense/data/lmdb/Hisense_trainval_lmdb"
    batch_size: 24
    backend: LMDB
  }

and

data_param {
    source: "/home/boyun/Hisense/data/lmdb/Hisense_test_lmdb"
    batch_size: 8
    backend: LMDB
  }

Then edit the training hyperparameters and paths in solver_train.prototxt:

train_net: "example/MobileNetSSD_train.prototxt"
test_net: "example/MobileNetSSD_test.prototxt"
test_iter: 673
test_interval: 1000
base_lr: 0.0005
display: 10
max_iter: 52000
lr_policy: "multistep"
gamma: 0.5
weight_decay: 0.00005
snapshot: 1000
snapshot_prefix: "snapshot/mobilenet"
solver_mode: GPU
debug_info: false
snapshot_after_train: true
test_initialization: false
average_loss: 10
stepvalue: 20000
stepvalue: 40000
iter_size: 1
type: "RMSProp"
eval_type: "detection"
ap_version: "11point"

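Finally, edit train.sh to set the path to the caffe binary (no change needed if your files are laid out like mine), the pretrained weights, and the GPU id.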

#!/bin/sh
if ! test -f example/MobileNetSSD_train.prototxt ;then
	echo "error: example/MobileNetSSD_train.prototxt does not exist."
	echo "please use the gen_model.sh to generate your own model."
        exit 1
fi
mkdir -p snapshot
../../build/tools/caffe train -solver="solver_train.prototxt" \
-weights="mobilenet_iter_73000.caffemodel" \
-gpu 0

Run the following in the /home/xxx/caffe-ssd/examples/MobileNet-SSD-master directory:

./train.sh

Training then starts. The output looks like this:

I0801 16:50:40.645901  7363 caffe.cpp:251] Starting Optimization
I0801 16:50:40.645918  7363 solver.cpp:294] Solving MobileNet-SSD
I0801 16:50:40.645923  7363 solver.cpp:295] Learning Rate Policy: multistep
I0801 16:50:40.649744  7363 blocking_queue.cpp:50] Data layer prefetch queue empty
I0801 16:50:42.626246  7363 solver.cpp:243] Iteration 0, loss = 12.7011
I0801 16:50:42.626294  7363 solver.cpp:259]     Train net output #0: mbox_loss = 12.7011 (* 1 = 12.7011 loss)
I0801 16:50:42.626328  7363 sgd_solver.cpp:138] Iteration 0, lr = 0.0005
I0801 16:51:07.516057  7363 solver.cpp:243] Iteration 10, loss = 7.21989
I0801 16:51:07.516435  7363 solver.cpp:259]     Train net output #0: mbox_loss = 5.77298 (* 1 = 5.77298 loss)
I0801 16:51:07.516465  7363 sgd_solver.cpp:138] Iteration 10, lr = 0.0005

5. Testing

Edit the configuration in demo.py:

import os
import sys

caffe_root = '/home/xxx/caffe-ssd/'  # path to caffe
sys.path.insert(0, caffe_root + 'python')
import caffe


net_file = 'deploy.prototxt'                     # deploy network definition
caffe_model = 'mobilenet_iter_12000.caffemodel'  # trained model
test_dir = "images"                              # folder with test images

if not os.path.exists(caffe_model):
    print(caffe_model + " does not exist")
    exit()
if not os.path.exists(net_file):
    print(net_file + " does not exist")
    exit()
net = caffe.Net(net_file, caffe_model, caffe.TEST)

CLASSES = ('background',                         # labels
           'a', 'b')

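To speed up the model, the author merges the bn layers into the convolution layers. This saves the bn computation time and may give a small boost to detection speed.

Open merge_bn.py, update the caffe path in it, then run: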

python merge_bn.py --model example/MobileNetSSD_deploy.prototxt --weights mobilenet_iter_12000.caffemodel
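
Why can a bn layer be merged away at all? At test time batch norm is just a fixed per-channel scale and shift, so it can be folded into the weights and bias of the preceding convolution. The sketch below shows that arithmetic for a single output channel. It is not the code of merge_bn.py; the function name fold_bn, the flat weight list and the eps value are assumptions made for illustration.

```python
import math


def fold_bn(weights, bias, mean, var, gamma, beta, eps=1e-5):
    """Fold one channel's batch-norm statistics and scale/shift into its conv parameters.

    Before: y = gamma * ((sum(w * x) + bias) - mean) / sqrt(var + eps) + beta
    After:  y = sum(w_new * x) + bias_new
    """
    scale = gamma / math.sqrt(var + eps)
    new_weights = [w * scale for w in weights]
    new_bias = (bias - mean) * scale + beta
    return new_weights, new_bias
```

The merged network gives the same output as the original one up to floating-point rounding, but has fewer layers to evaluate.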

 

Done!
