Building caffe-ssd on Linux with CUDA 10 and Python 3.6, then training and testing on the VOC dataset

Introduction

I had previously built Caffe from source, but caffe-ssd still has to be compiled and installed separately.

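My environment

Ubuntu 18.04

Python 3.6

CUDA 10.0

cuDNN 7.6.3

OpenCV 3.4.6

Almost everything was installed under the home directory of a regular user (zhangyt). CUDA 10 and cuDNN were already installed on the server in their standard locations, and Python was already installed system-wide. I installed OpenCV 3.4.6 myself under my own user directory (caffe); see my blog for the details.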

1. Building the caffe-ssd environment

First, clone the repository and check out the ssd branch:

git clone https://github.com/weiliu89/caffe.git 
cd caffe 
git checkout ssd

1.1 Modify Makefile.config

Use the same settings as in my earlier Caffe build:

https://blog.csdn.net/pursuit_zhangyu/article/details/104770610

OpenCV must also be installed before making these changes; see https://blog.csdn.net/pursuit_zhangyu/article/details/104754087

1.2 Modify the Makefile

vim Makefile

Remember to add opencv_videoio; everything else stays the same

LIBRARIES += glog gflags protobuf leveldb snappy \
    lmdb boost_system boost_filesystem hdf5_hl hdf5 m \
    opencv_core opencv_highgui opencv_imgproc opencv_imgcodecs opencv_videoio

1.3 Build and install

make all -j16
make py -j16
make test -j16

Some of the tests in make runtest may fail without affecting normal use. In general it is enough that make all, make pycaffe and make test finish without errors.

Error

/include/caffe/util/cudnn.hpp:21:10: warning: enumeration value ‘CUDNN_STATUS_RUNTIME_IN_PROGRESS’ 

Solution

This is a cuDNN-related error, similar to this one:

https://github.com/BVLC/caffe/issues/5793#issuecomment-404729152

Take this cudnn.hpp file

https://github.com/BVLC/caffe/blob/master/include/caffe/util/cudnn.hpp

and use it to overwrite include/caffe/util/cudnn.hpp inside the caffe directory.

1.4 Add the environment variable

vim ~/.bashrc

#caffe
export PYTHONPATH=/home/zhangyt/caffe/python:$PYTHONPATH

source ~/.bashrc 
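
A quick way to confirm that the PYTHONPATH change took effect is to ask Python whether it can locate the caffe package. The sketch below is only a convenience check; the helper name module_available is my own and is not part of Caffe.

```python
import importlib.util


def module_available(name):
    """Return True if Python can locate a top-level module called `name`."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # True once the caffe/python directory is on PYTHONPATH.
    # This only locates the package; it does not verify that the
    # compiled _caffe extension actually loads.
    print("caffe found:", module_available("caffe"))
```

If this prints False in a new shell, the export line in ~/.bashrc has probably not been sourced yet.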

2. Preparation

2.1 Download the VOC dataset

The VOC2007 and VOC2012 datasets

Go to the data directory under the Caffe root directory:

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar

If the downloads fail, here is a Baidu Netdisk copy:

Link: https://pan.baidu.com/s/1JS14v3uKkB-cDqZwtgEWIA
Extraction code: 8uhn

Then extract the archives:

tar -xvf VOCtrainval_11-May-2012.tar 
tar -xvf VOCtrainval_06-Nov-2007.tar 
tar -xvf VOCtest_06-Nov-2007.tar

Download the VGG16 model

cd ~/caffe/models
mkdir VGGNet 
cd VGGNet

Download the VGG caffemodel and place it in the VGGNet directory:

https://raw.githubusercontent.com/conner99/VGGNet/master/VGG_ILSVRC_16_layers_fc_reduced.caffemodel

2.2 Convert the VOC data to LMDB format

cd ~/caffe                               # go to the Caffe root directory
vim data/VOC0712/create_list.sh
vim data/VOC0712/create_data.sh

(1) Modify create_list.sh

Walkthroughs of create_list.sh:

https://blog.csdn.net/qq_32149483/article/details/92250998

https://blog.csdn.net/qq_21368481/article/details/82350331

My data is stored under ~/data, so nothing needs to be changed.

 

(2) Modify create_data.sh

Walkthrough of create_data.sh:

https://blog.csdn.net/maum61/article/details/99606726

Set

width=300
height=300

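(3) Modify labelmap_voc.prototxt

No change is needed when training on the VOC dataset. When training on your own dataset, this file has to be edited; note that the background also counts as one class.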
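
For a custom dataset, the label map can be generated rather than typed by hand. The sketch below assumes the item layout used by the stock labelmap_voc.prototxt (name, label, display_name, with label 0 reserved for the background); the function name build_labelmap and the two class names are hypothetical.

```python
def build_labelmap(class_names):
    """Return label map text: label 0 is the background, then one item per class."""
    entries = [("none_of_the_above", 0, "background")]
    for index, name in enumerate(class_names, start=1):
        entries.append((name, index, name))

    blocks = []
    for name, label, display_name in entries:
        blocks.append(
            'item {\n'
            '  name: "%s"\n'
            '  label: %d\n'
            '  display_name: "%s"\n'
            '}' % (name, label, display_name)
        )
    return "\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Hypothetical two-class dataset: the file then holds 2 + 1 = 3 items.
    print(build_labelmap(["hat", "person"]))
```

The number of items in the generated file is the value that num_classes should take in the training script.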

 

(4) Generate the LMDB dataset

cd ~/caffe                               # go to the Caffe root directory
./data/VOC0712/create_list.sh

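This mainly generates test_name_size.txt, test.txt and trainval.txt. The first records the ID, height and width of each test image; the other two pair each test or trainval image with its XML annotation file.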

./data/VOC0712/create_data.sh

The output looks like the screenshot below; its first three lines are output from my earlier debugging.

[Image 1: screenshot of the create_data.sh output]

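3. Training

(1) Modify ssd_pascal.py

This is finally the last step: only ssd_pascal.py needs to be modified.

cd ~/caffe/examples/ssd              # go to the directory containing ssd_pascal.py
cp ssd_pascal.py my_ssd_pascal.py    # leave the original ssd_pascal.py untouched where possible
vim my_ssd_pascal.py          # edit my_ssd_pascal.py

Tip

To show line numbers in vim, press Esc and then type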

:set number

 

# Line 82: set the LMDB locations
# The *.sh scripts run in the previous step printed where the LMDB files were stored
# The database file for training data. Created by data/VOC0712/create_data.sh
train_data = "/home/zhangyt/data/VOCdevkit/VOC0712/lmdb/VOC07_trainval_lmdb"
# The database file for testing data. Created by data/VOC0712/create_data.sh
test_data = "/home/zhangyt/data/VOCdevkit/VOC0712/lmdb/VOC07_test_lmdb"

# Line 258: set the required file locations
# Stores the test image names and sizes. Created by data/VOC0712/create_list.sh
name_size_file = "data/VOC0712/test_name_size.txt"
# The pretrained model. We use the Fully convolutional reduced (atrous) VGGNet.
pretrain_model = "models/VGGNet/VGG_ILSVRC_16_layers_fc_reduced.caffemodel"
# Stores LabelMapItem.
label_map_file = "data/VOC0712/labelmap_voc.prototxt"
# MultiBoxLoss parameters.
num_classes = 21            # number of object classes + 1 for the background; for 2 classes write 2 + 1 = 3

# Line 332: set the GPUs
gpus = "4"   # comma-separated IDs of the GPUs to use, numbered from 0 (for example "0,1"); with a single GPU write "0"

# Line 337: set the batch size
batch_size = 16    # choose according to GPU memory; larger is better as long as it fits

# Line 359: set the number of test images
num_test_image = 4952   # fill in the actual number of images in your test set
# This equals the number of lines in $caffe_root/data/VOCdevkit/MyDataSet/ImageSets/Main/test.txt

# When done, save and quit with :wq
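
Rather than counting by hand, num_test_image can be derived from the test list itself, since it equals the number of lines in test.txt. The sketch below is one way to do that; count_nonempty_lines is a hypothetical helper, and the demo runs on a throwaway file standing in for the real test.txt.

```python
import os
import tempfile


def count_nonempty_lines(path):
    """Count the non-empty lines of a text file such as ImageSets/Main/test.txt."""
    with open(path) as handle:
        return sum(1 for line in handle if line.strip())


if __name__ == "__main__":
    # Throwaway file with three image IDs, standing in for test.txt.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
        tmp.write("000001\n000002\n000003\n")
    print(count_nonempty_lines(tmp.name))  # 3
    os.remove(tmp.name)
```

For the VOC2007 test set used here the count should come out as 4952.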

 

cd ~/caffe
python examples/ssd/my_ssd_pascal.py

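The code was written for Python 2, so it needs some changes.

Problem 1

NameError: name 'xrange' is not defined

This occurs in two places.

Solution

vim examples/ssd/my_ssd_pascal.py

Replace every xrange with range. Python 3 has no xrange; its range() is lazy, like Python 2's xrange().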
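
If you would rather not edit each call site, a small compatibility shim near the top of the script has the same effect. This is an alternative sketch, not what I actually did above.

```python
try:
    xrange  # defined on Python 2 only
except NameError:
    xrange = range  # on Python 3, range is already lazy

total = sum(i for i in xrange(5))
print(total)  # 10
```

The rest of the script can then keep calling xrange unchanged under both Python versions.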

Problem 2

Running

python examples/ssd/my_ssd_pascal.py

produces

Traceback (most recent call last):
  File "examples/ssd/my_ssd_pascal.py", line 446, in <module>
    AddExtraLayers(net, use_batchnorm, lr_mult=lr_mult)
  File "examples/ssd/my_ssd_pascal.py", line 25, in AddExtraLayers
    lr_mult=lr_mult)
  File "/home/zhangyt/caffe/python/caffe/model_libs.py", line 93, in ConvBNLayer
    [kernel_h, kernel_w] = UnpackVariable(kernel_size, 2)
  File "/home/zhangyt/caffe/python/caffe/model_libs.py", line 16, in UnpackVariable
    assert len > 0
TypeError: '>' not supported between instances of 'builtin_function_or_method' and 'int'

Solution

vim /home/zhangyt/caffe/python/caffe/model_libs.py

Comment out assert len > 0 on line 16, so that it becomes:

#assert len > 0
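
The assertion compares the built-in len function itself with 0, which Python 2 tolerated and Python 3 rejects. My guess, which the caffe-ssd authors have not confirmed, is that the check was meant for the num argument. The sketch below shows a helper with that assumed intent; unpack_variable is a stand-in name, and its body is reconstructed from how the traceback shows UnpackVariable being called, not copied from model_libs.py.

```python
def unpack_variable(var, num):
    """Expand `var` into a list of length `num` (sketch of UnpackVariable)."""
    assert num > 0  # assumed intent of the original `assert len > 0`
    if isinstance(var, list) and len(var) == num:
        return var
    if isinstance(var, list):
        # A one-element list is repeated to the requested length.
        assert len(var) == 1
        return [var[0]] * num
    # A scalar is repeated to the requested length.
    return [var] * num


kernel_h, kernel_w = unpack_variable(3, 2)
print(kernel_h, kernel_w)  # 3 3
```

Simply commenting the assertion out, as above, is enough to get past the error.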

Problem 3

Running

python examples/ssd/my_ssd_pascal.py

produces

zhangyt@ubuntu:~/caffe$ python examples/ssd/my_ssd_pascal.py
/usr/bin/python3
Traceback (most recent call last):
  File "examples/ssd/my_ssd_pascal.py", line 463, in <module>
    print(net.to_proto(), file=f)
  File "/home/zhangyt/caffe/python/caffe/net_spec.py", line 209, in to_proto
    top._to_proto(layers, names, autonames)
  File "/home/zhangyt/caffe/python/caffe/net_spec.py", line 100, in _to_proto
    return self.fn._to_proto(layers, names, autonames)
  File "/home/zhangyt/caffe/python/caffe/net_spec.py", line 162, in _to_proto
    _param_names[self.type_name] + '_param'), k, v)
  File "/home/zhangyt/caffe/python/caffe/net_spec.py", line 74, in assign_proto
    getattr(proto, name).extend(val)
TypeError: 1.0 has type float, but expected one of: int, long

Cause: in Python 3 the / operator always returns a float, and here the result ends up in a protobuf field that expects an integer.

Solution

Reference: https://blog.csdn.net/dihe874981/article/details/86622906

vim /home/zhangyt/caffe/python/caffe/model_libs.py

Line 156 (in vim, typing 156gg jumps straight to line 156):

pad = int((3 + (dilation - 1) * 2) - 1) / 2
change to
pad = int((3 + (dilation - 1) * 2) - 1) // 2

Line 375, change / to // in the same way:

pad = int((kernel_size + (dilation - 1) * (kernel_size - 1)) - 1) // 2

Line 417, likewise:

pad = int((kernel_size + (dilation - 1) * (kernel_size - 1)) - 1) // 2
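
The difference between the two operators is easy to see in isolation. The sketch below wraps the same padding formula in a hypothetical helper, atrous_pad, which is not a function in model_libs.py.

```python
def atrous_pad(kernel_size, dilation):
    """Padding for a dilated convolution, using the formula from model_libs.py."""
    effective = kernel_size + (dilation - 1) * (kernel_size - 1)
    return (effective - 1) // 2  # floor division keeps the result an int


print(type((3 - 1) / 2).__name__)   # float
print(type((3 - 1) // 2).__name__)  # int
print(atrous_pad(3, 6))             # 6
```

Because the operands are already integers, // gives the same value that Python 2's / did, only with an int type that protobuf accepts.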

 

Problem 4

python examples/ssd/my_ssd_pascal.py

produces

Check failed: mdb_status == 0 (2 vs. 0) No such file or directory

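According to posts online, this means a path is wrong.

vim examples/ssd/my_ssd_pascal.py

Here it was the path to the LMDB data that was wrong.

After changing it to an absolute path, the error went away.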
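
A path problem like this can be caught before launching training. The sketch below checks that an LMDB location is absolute and exists as a directory; check_lmdb_path is a hypothetical helper, and the two paths in the demo are the ones from my script.

```python
import os


def check_lmdb_path(path):
    """Return a list of problems found with an LMDB directory path."""
    problems = []
    if not os.path.isabs(path):
        problems.append("not an absolute path: %s" % path)
    if not os.path.isdir(path):
        problems.append("directory does not exist: %s" % path)
    return problems


if __name__ == "__main__":
    for lmdb_dir in [
        "/home/zhangyt/data/VOCdevkit/VOC0712/lmdb/VOC07_trainval_lmdb",
        "/home/zhangyt/data/VOCdevkit/VOC0712/lmdb/VOC07_test_lmdb",
    ]:
        print(lmdb_dir, check_lmdb_path(lmdb_dir) or "OK")
```

An empty list means the path passed both checks.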

Problem 5 (see Problem 6 for the fix)

python examples/ssd/my_ssd_pascal.py

then fails with

F0325 16:26:19.637320 35953 math_functions.cpp:250] Check failed: a <= b (0 vs. -1.19209e-07)

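The fix suggested online is to comment out CHECK_LE(a, b), but that causes a bigger problem. The correct solution is the one given in Problem 6.

If CHECK_LE(a, b) is commented out, you get Data layer prefetch queue empty.

If CHECK_LE(a, b) is left in place, the check fails whenever a is greater than b.

For reference, the workaround from online is:

vim ~/caffe/src/caffe/util/math_functions.cpp

Comment out line 250

//CHECK_LE(a, b);

Then rebuild from the caffe directory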

make py -j16

Problem 6

W0326 22:03:15.797585 20276 upgrade_proto.cpp:72] Note that future Caffe releases will only support input layers and not input fields.
I0326 22:03:15.811280 20276 net.cpp:761] Ignoring source layer drop6
I0326 22:03:15.811954 20276 net.cpp:761] Ignoring source layer drop7
I0326 22:03:15.811962 20276 net.cpp:761] Ignoring source layer fc8
I0326 22:03:15.811964 20276 net.cpp:761] Ignoring source layer prob
I0326 22:03:15.814827 20276 caffe.cpp:251] Starting Optimization
I0326 22:03:15.814855 20276 solver.cpp:294] Solving VGG_VOC0712_Hat_SSD_300x300_train
I0326 22:03:15.814859 20276 solver.cpp:295] Learning Rate Policy: multistep
I0326 22:03:15.819855 20276 blocking_queue.cpp:50] Data layer prefetch queue empty

This problem appears during training; see

https://github.com/weiliu89/caffe/issues/863

vim src/caffe/util/sampler.cpp

At line 111, add the two if blocks that clamp bbox_width and bbox_height, so that the code reads:

// Figure out bbox dimension.
float bbox_width = scale * sqrt(aspect_ratio);
float bbox_height = scale / sqrt(aspect_ratio);
if (bbox_width >= 1.0) {
  bbox_width = 1.0;
}
if (bbox_height >= 1.0) {
  bbox_height = 1.0;
}
// Figure out top left coordinates.
float w_off, h_off;
caffe_rng_uniform(1, 0.f, 1.0f - bbox_width, &w_off);
caffe_rng_uniform(1, 0.f, 1.0f - bbox_height, &h_off);

Since C/C++ files were modified, a full rebuild is required

make clean
make all -j32
make py -j32
make test -j32
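
To see why the clamp resolves the CHECK_LE failure from Problem 5, it helps to look at the upper bound handed to caffe_rng_uniform. The sketch below reproduces just that arithmetic in Python; it is an illustration and not Caffe code. The width value is the one implied by the log line "0 vs. -1.19209e-07", and sampling_upper_bound is a hypothetical name.

```python
def sampling_upper_bound(bbox_width, clamp):
    """Upper bound of the uniform range used to sample the box offset."""
    if clamp and bbox_width >= 1.0:
        bbox_width = 1.0
    return 1.0 - bbox_width


# Width that overshoots 1.0 by a rounding error, as implied by the log.
overshoot = 1.0 + 1.19209e-07

print(sampling_upper_bound(overshoot, clamp=False) < 0.0)  # True: 0 <= b fails
print(sampling_upper_bound(overshoot, clamp=True) == 0.0)  # True: range [0, 0] is valid
```

With the clamp in place the lower bound 0 never exceeds the upper bound, so the check can stay enabled.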

4. Experimental results

https://blog.csdn.net/jsk_learner/article/details/95451965?depth_1-utm_source=distribute.pc_relevant.none-task&utm_source=distribute.pc_relevant.none-task

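  • jobs

Log files, network definitions, and so on.

The training log there is VGG_VOC0712_SSD_300x300.log; look in it for the test entries and the mAP values.

  • results

Results for the 21 classes.

  • SSD_300x300

Saved models, network definitions, and so on.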
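
The mAP values in the training log under jobs can be pulled out with a short script instead of scrolling through the file. The sketch below assumes that each test pass writes a line containing "detection_eval = <value>"; that format is my assumption, so check it against your own log and adjust the pattern if needed. The sample lines in the demo are made up for illustration.

```python
import re

# Assumed log format; adjust the pattern to match your own log.
EVAL_PATTERN = re.compile(r"detection_eval = ([0-9.eE+-]+)")


def extract_map_values(log_text):
    """Return every detection_eval value found in the log text, in order."""
    return [float(match.group(1)) for match in EVAL_PATTERN.finditer(log_text)]


if __name__ == "__main__":
    # Hypothetical excerpt, not copied from a real run.
    sample = (
        "Test net output #0: detection_eval = 0.5\n"
        "Test net output #0: detection_eval = 0.75\n"
    )
    print(extract_map_values(sample))  # [0.5, 0.75]
```

To use it on a real run, read the .log file into a string and pass it to extract_map_values.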


5. Testing on a single image

5.1 File modifications

cd caffe/examples/ssd/
cp ssd_detect.py ssd_detect_voc.py
vim ssd_detect_voc.py

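This script predicts the class and location of the objects in a test image, then labels them on the image and draws the bounding boxes.
The main change is to adjust a few paths and names in the parse_args function, at line 136.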

    parser = argparse.ArgumentParser()
    parser.add_argument('--gpu_id', type=int, default=2, help='gpu id')
    parser.add_argument('--labelmap_file',
                        default='data/VOC07/labelmap_voc.prototxt')
    parser.add_argument('--model_def',
                        default='models/VGGNet/VOC0712/SSD_300x300/deploy.prototxt')
    parser.add_argument('--image_resize', default=300, type=int)
    parser.add_argument('--model_weights',
                        default='models/VGGNet/VOC0712/SSD_300x300/'
                        'VGG_VOC0712_SSD_300x300_iter_120000.caffemodel')
    parser.add_argument('--image_file', default='examples/images/fish-bike.jpg')
    return parser.parse_args()
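
Because these are argparse defaults, they can also be overridden on the command line without editing the file. The sketch below uses a reduced copy of the parser to show the behaviour; the argv parameter is a hypothetical addition that makes the function easy to call directly, and my.jpg is a placeholder file name.

```python
import argparse


def parse_args(argv=None):
    """Reduced copy of the script's parser; argv=None falls back to sys.argv."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--gpu_id', type=int, default=2, help='gpu id')
    parser.add_argument('--image_resize', default=300, type=int)
    parser.add_argument('--image_file', default='examples/images/fish-bike.jpg')
    return parser.parse_args(argv)


# Options given explicitly replace the defaults; the rest keep them.
args = parse_args(['--gpu_id', '0', '--image_file', 'my.jpg'])
print(args.gpu_id, args.image_resize, args.image_file)  # 0 300 my.jpg
```

On the real script the equivalent would be passing --gpu_id and --image_file after the script name when running it.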

5.2 Run

cd ~/caffe 
python examples/ssd/ssd_detect_voc.py

This again fails with some Python 2 versus Python 3 errors.

vim examples/ssd/ssd_detect_voc.py

After fixing the print and xrange errors, running it again gives

Traceback (most recent call last):
  File "examples/ssd/ssd_detect_voc.py", line 149, in <module>
    main(parse_args())
  File "examples/ssd/ssd_detect_voc.py", line 113, in main
    result = detection.detect(args.image_file)
  File "examples/ssd/ssd_detect_voc.py", line 68, in detect
    image = caffe.io.load_image(image_file)
  File "./python/caffe/io.py", line 296, in load_image
    img = skimage.img_as_float(skimage.io.imread(filename, as_grey=not color)).astype(np.float32)
  File "/usr/local/lib/python3.6/dist-packages/skimage/io/_io.py", line 48, in imread
    img = call_plugin('imread', fname, plugin=plugin, **plugin_args)
  File "/usr/local/lib/python3.6/dist-packages/skimage/io/manage_plugins.py", line 210, in call_plugin
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/skimage/io/_plugins/imageio_plugin.py", line 10, in imread
    return np.asarray(imageio_imread(*args, **kwargs))
  File "/usr/local/lib/python3.6/dist-packages/imageio/core/functions.py", line 265, in imread
    reader = read(uri, format, "i", **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/imageio/core/functions.py", line 186, in get_reader
    return format.get_reader(request)
  File "/usr/local/lib/python3.6/dist-packages/imageio/core/format.py", line 170, in get_reader
    return self.Reader(self, request)
  File "/usr/local/lib/python3.6/dist-packages/imageio/core/format.py", line 221, in __init__
    self._open(**self.request.kwargs.copy())
TypeError: _open() got an unexpected keyword argument 'as_grey'

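Replace the caffe.io.load_image call with a cv2-based read:

image = cv2.imread(image_file)                  # OpenCV loads the image in BGR order
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # convert to RGB, which is what caffe.io.load_image returns
image = image / 255.0                           # scale pixel values to [0, 1]

Remember to add

import cv2

The test result is saved in the caffe directory.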

 

 

 
