基于python+caffe的faster rcnn训练识别

clone整个工程,内嵌工程内的caffe:
git clone –recursive https://github.com/rbgirshick/py-faster-rcnn.git, 设置工程根目录为$FRCN_ROOT=clone 根目录
Build the Cython modules: cd $FRCN_ROOT/lib, make
build caffe
- cd $FRCN_ROOT/caffe-fast-rcnn 创建make file:cp Makefile.config.example Makefile.config
- 修改Makefile.config:开启OPENCV_VERSION := 3, WITH_PYTHON_LAYER := 1;
- 修改USE_OPENCV := 1,INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/,LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial/ /usr/local/lib/lib/ /usr/lib/x86_64-linux-gnu/(没有添加会报错no hdf5.h found)
  其中有一个点比较奇怪: 原生的caffe USE_CUDNN := 1 build是ok的, py faster内嵌的caffe版本同样的环境开启USE_CUDNN := 1就会编译失败, 后面训练+识别都用到了gpu, 这个版本的Makefile.config默认就是注释掉的,所以无伤大雅
  附上配置

cat $FRCN_ROOT/caffe-fast-rcnn/Makefile.config
## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!

# cuDNN acceleration switch (uncomment to build with cuDNN).
# USE_CUDNN := 1

# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1

# uncomment to disable IO dependencies and corresponding data layers
USE_OPENCV := 1
# USE_LEVELDB := 0
USE_LMDB := 1

# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
#   You should not set this flag if you will be reading LMDBs with any
#   possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1

# Uncomment if you're using OpenCV 3
OPENCV_VERSION := 3

# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++

# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr

# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
        -gencode arch=compute_20,code=sm_21 \
        -gencode arch=compute_30,code=sm_30 \
        -gencode arch=compute_35,code=sm_35 \
        -gencode arch=compute_50,code=sm_50 \
        -gencode arch=compute_50,code=compute_50

# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := atlas
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas

# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib

# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
# MATLAB_DIR := /usr/local
# MATLAB_DIR := /Applications/MATLAB_R2012b.app

# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
PYTHON_INCLUDE := /usr/include/python2.7 \
        /usr/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
# ANACONDA_HOME := $(HOME)/anaconda
# PYTHON_INCLUDE := $(ANACONDA_HOME)/include \
        # $(ANACONDA_HOME)/include/python2.7 \
        # $(ANACONDA_HOME)/lib/python2.7/site-packages/numpy/core/include \

# Uncomment to use Python 3 (default is Python 2)
# PYTHON_LIBRARIES := boost_python3 python3.5m
# PYTHON_INCLUDE := /usr/include/python3.5m \
#                 /usr/lib/python3.5/dist-packages/numpy/core/include

# We need to be able to find libpythonX.X.so or .dylib.
PYTHON_LIB := /usr/lib
# PYTHON_LIB := $(ANACONDA_HOME)/lib

# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib

# Uncomment to support layers written in Python (will link against Python libs)
 WITH_PYTHON_LAYER := 1

# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include /usr/include/hdf5/serial/ /usr/include/opencv/ /usr/include/opencv2/
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib /usr/lib/x86_64-linux-gnu/hdf5/serial/  /usr/local/lib/lib/ /usr/lib/x86_64-linux-gnu/

# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib

# Uncomment to use `pkg-config` to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1

BUILD_DIR := build
DISTRIBUTE_DIR := distribute

# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
# DEBUG := 1

# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0

# enable pretty build (comment to see full commands)
Q ?= @

配置好了之后make all -j12 && make pycaffe

构建数据集结构: cd $FRCN_ROOT/data, 数据目录结构如下:

下载vgg16的模型参数到下:./data/scripts/fetch_imagenet_models.sh, 这个脚本是从dropbox下的, 但是dropbox访问非常慢,可以从网上找然后放到$FRCN_ROOT/data/imagenet_models下
调试过程中可以使用原生的数据集,运行命令

./experiments/scripts/faster_rcnn_alt_opt.sh [GPU_ID] [NET] [--set ...]
# GPU_ID is the GPU you want to train on
# NET in {ZF, VGG_CNN_M_1024, VGG16} is the network arch to use
# --set ... allows you to specify fast_rcnn.config options, e.g.
#   --set EXP_DIR seed_rng1701 RNG_SEED 1701
比如:nohup ./experiments/scripts/faster_rcnn_end2end.sh 0 VGG16 pascal_voc 2>&1 &

但是启动可能会遇到如下一些问题, 此处都是其他同事告诉我的(https://www.douban.com/note/630834963/?type=rec#sep):

AttributeError: 'module' object has no attribute ‘text_format'
解决方法：在/home/xxx/py-faster-rcnn/lib/fast_rcnn/train.py的头文件导入部分加上：import google.protobuf.text_format
TypeError: 'numpy.float64' object cannot be interpreted as an index

这里是因为numpy版本不兼容导致的问题，最好的解决办法是卸载你的numpy，安装numpy1.11.0, 但是重新装numpy1.11.0又和tensorflow冲突
就修改代码兼容, 所以这里最好用docker做环境, 相互隔离:

$FRCN_ROOT/lib/roi_data_layer/minibatch.py

将第26行：fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image)
改为：fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image).astype(np.int)

$FRCN_ROOT/lib/datasets/ds_utils.py

将第12行：hashes = np.round(boxes * scale).dot(v)
改为：hashes = np.round(boxes * scale).dot(v).astype(np.int)

$FRCN_ROOT/lib/fast_rcnn/test.py

将第129行： hashes = np.round(blobs['rois'] * cfg.DEDUP_BOXES).dot(v)
改为： hashes = np.round(blobs['rois'] * cfg.DEDUP_BOXES).dot(v).astype(np.int)

$FRCN_ROOT/lib/rpn/proposal_target_layer.py

将第60行：fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image)
改为：fg_rois_per_image = np.round(cfg.TRAIN.FG_FRACTION * rois_per_image).astype(np.int)

TypeError: slice indices must be integers or None or have an index method
修改 /home/lzx/py-faster-rcnn/lib/rpn/proposal_target_layer.py，转到123行：

for ind in inds:
cls = clss[ind]
start = 4 * cls
end = start + 4
bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
return bbox_targets, bbox_inside_weights

这里的ind，start，end都是 numpy.int 类型，这种类型的数据不能作为索引，所以必须对其进行强制类型转换，转化结果如下：

for ind in inds:
ind = int(ind)
cls = clss[ind]
start = int(4 * cos)
end = int(start + 4)
bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]
bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS
return bbox_targets, bbox_inside_weights

不排除build 还有其他问题, 需要分析Google解决
看到原生的数据集可以开始训练环境setup基本就可以了

2.标注数据

git clone https://github.com/tzutalin/labelImg.git
我是在mac上标注,按照教程setup 没有遇到问题
Python 3 + Qt5 (Works on macOS High Sierra)

brew install qt  # will install qt-5.x.x
brew install libxml2
make qt5py3
python labelImg.py
python  labelImg.py [IMAGE_PATH] [PRE-DEFINED CLASS FILE]

在data/predefined_classes.txt下提前写入好的分类的label,设置输出目录

然后框图选择对应label,会在输入目录生成对应图的xml文件标识分类

3.训练和识别

训练
- 当label数量和种类发生变化时需要清除cache文件: rm -rf data/cache, 否则训练启动时包box框的assert错误
- xml文件放入$FRCN_ROOT/data/VOCdevkit2007/VOC2007/Annotations
- jpg文件放入$FRCN_ROOT/data/VOCdevkit2007/VOC2007/JPEGImages
- list.py生成trainval.txt放入ImageSets/Main下
```
cd   $FRCN_ROOT/data/VOCdevkit2007/  && python list.py VOC2007/ 
mv trainval.txt ImageSets/Main/
```
- 根据训练的类别与种类修改lib下datasets下pascal_voc.py文件， self._classes =（''）例如五个分类+背景:
```
        self._classes = ('__background__', # always index 0
                       '00001', '00002', '00003','00004', '00005')
```
- 以vgg16为例修改$FRCN_ROOT/models/pascal_voc/VGG16/faster_rcnn_end2end/train.prototxt ,找到input_data修改num_classes = 6 (五个分类+背景)
  
  找到roi-data下的python_param修改num_classes = 6
  
  找到cls_score下inner_product_param的num_output, 注意cls_score有多处,找到name和top都是'cls_score'的那个
  
  同理修改bbox_pred下的num_output为24(label数目*4)
- 同样道理修改
  
  $FRCN_ROOT/models/pascal_voc/VGG16/faster_rcnn_end2end/test.prototxt test.prototxt中没有input-data层，所以只需按照train中的修改cls_score层以及bbox_pred层即可

开始训练nohup ./experiments/scripts/faster_rcnn_end2end.sh 0 VGG16 pascal_voc 2>&1 & 看到solving就是开始训练了

  Solving...
  I0125 16:44:55.569564 29848 solver.cpp:229] Iteration 0, loss = 3.72803
  I0125 16:44:55.569602 29848 solver.cpp:245]     Train net output #0: loss_bbox = 0.0788447 (* 1 = 0.0788447 loss)
  I0125 16:44:55.569607 29848 solver.cpp:245]     Train net output #1: loss_cls = 2.51337 (* 1 = 2.51337 loss)
  I0125 16:44:55.569612 29848 solver.cpp:245]     Train net output #2: rpn_cls_loss = 0.77974 (* 1 = 0.77974 loss)
  I0125 16:44:55.569614 29848 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0.10321 (* 1 = 0.10321 loss)
  I0125 16:44:55.569618 29848 sgd_solver.cpp:106] Iteration 0, lr = 0.001
  I0125 16:45:05.665946 29848 solver.cpp:229] Iteration 20, loss = 1.29099
  I0125 16:45:05.665966 29848 solver.cpp:245]     Train net output #0: loss_bbox = 0.387312 (* 1 = 0.387312 loss)
  I0125 16:45:05.665971 29848 solver.cpp:245]     Train net output #1: loss_cls = 0.453434 (* 1 = 0.453434 loss)
  I0125 16:45:05.665974 29848 solver.cpp:245]     Train net output #2: rpn_cls_loss = 0.119676 (* 1 = 0.119676 loss)
  I0125 16:45:05.665992 29848 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0.0238676 (* 1 = 0.0238676 loss)
  I0125 16:45:05.665995 29848 sgd_solver.cpp:106] Iteration 20, lr = 0.001
  I0125 16:45:15.740180 29848 solver.cpp:229] Iteration 40, loss = 1.198
  I0125 16:45:15.740201 29848 solver.cpp:245]     Train net output #0: loss_bbox = 0.735388 (* 1 = 0.735388 loss)
  I0125 16:45:15.740221 29848 solver.cpp:245]     Train net output #1: loss_cls = 0.691414 (* 1 = 0.691414 loss)
  I0125 16:45:15.740224 29848 solver.cpp:245]     Train net output #2: rpn_cls_loss = 0.055582 (* 1 = 0.055582 loss)
  I0125 16:45:15.740228 29848 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0.0227136 (* 1 = 0.0227136 loss)
  I0125 16:45:15.740231 29848 sgd_solver.cpp:106] Iteration 40, lr = 0.001
  I0125 16:45:25.829916 29848 solver.cpp:229] Iteration 60, loss = 0.757013
  I0125 16:45:25.829936 29848 solver.cpp:245]     Train net output #0: loss_bbox = 0.599626 (* 1 = 0.599626 loss)
  I0125 16:45:25.829939 29848 solver.cpp:245]     Train net output #1: loss_cls = 0.348519 (* 1 = 0.348519 loss)
  I0125 16:45:25.829942 29848 solver.cpp:245]     Train net output #2: rpn_cls_loss = 0.0327966 (* 1 = 0.0327966 loss)
  I0125 16:45:25.829960 29848 solver.cpp:245]     Train net output #3: rpn_loss_bbox = 0.0138469 (* 1 = 0.0138469 loss)
  I0125 16:45:25.829963 29848 sgd_solver.cpp:106] Iteration 60, lr = 0.001

可以修改$FRCN_ROOT/lib/fast_rcnn/config.py 控制保存model间隔, 尤其是调试的时候设置小一点快速发现问题, __C.TRAIN.SNAPSHOT_ITERS = 10000

识别&测试
训练好的模型在output/faster_rcnn_end2end/voc_2007_trainval/下

    db24gt@db24gt-XPS-8930:~/workspace/github/py-faster-rcnn$ ls output/faster_rcnn_end2end/voc_2007_trainval/vgg16_faster_rcnn_iter_*
    output/faster_rcnn_end2end/voc_2007_trainval/vgg16_faster_rcnn_iter_10000.caffemodel  output/faster_rcnn_end2end/voc_2007_trainval/     vgg16_faster_rcnn_iter_50000.caffemodel
    output/faster_rcnn_end2end/voc_2007_trainval/vgg16_faster_rcnn_iter_20000.caffemodel  output/faster_rcnn_end2end/voc_2007_trainval/     vgg16_faster_rcnn_iter_60000.caffemodel
    output/faster_rcnn_end2end/voc_2007_trainval/vgg16_faster_rcnn_iter_30000.caffemodel  output/faster_rcnn_end2end/voc_2007_trainval/     vgg16_faster_rcnn_iter_70000.caffemodel
    output/faster_rcnn_end2end/voc_2007_trainval/vgg16_faster_rcnn_iter_40000.caffemodel

测试代码样例
加入环境变量可以找到python库, 我写的是本机绝对路径

    """Set up paths for Fast R-CNN."""
    
    import os.path as osp
    import sys
    
    def add_path(path):
        if path not in sys.path:
            sys.path.insert(0, path)
    
    #this_dir = osp.dirname(__file__)
    
    # Add caffe to PYTHONPATH
    caffe_path = osp.join('/home/db24gt/workspace/github/py-faster-rcnn/caffe-fast-rcnn/python')
    #caffe_path = osp.join(this_dir, '..', 'caffe-ssd', 'python')
    add_path(caffe_path)
    
    # Add lib to PYTHONPATH
    lib_path = osp.join('/home/db24gt/workspace/github/py-faster-rcnn/lib')
    add_path(lib_path)

            #!/usr/bin/env python
        
        import _init_paths
        import json
        import caffe
        import matplotlib.pyplot as plt
        import numpy as np
        import os
        from fast_rcnn.config import cfg
        from fast_rcnn.nms_wrapper import nms
        from fast_rcnn.test import im_detect
        from utils.timer import Timer
        import cv2
        
        CLASSES = (
        '__background__', '00001-RED_TEA', '00002-N_QUICK_LINE', '00003-HEART_WATER', '00004-LAYS', '00005-ICE_SUGAR')
        
        cfg.TEST.HAS_RPN = True  # Use RPN for proposals
        
        
        demo_net = 'vgg16'
        prototxt = '/home/db24gt/workspace/github/py-faster-rcnn/models/pascal_voc/VGG16/faster_rcnn_end2end/test.prototxt'
        caffemodel = '/home/db24gt/workspace/github/py-faster-rcnn/output/faster_rcnn_end2end/voc_2007_trainval/            vgg16_faster_rcnn_iter_70000.caffemodel'
        
        if not os.path.isfile(caffemodel):
            raise IOError(('{:s} not found.\nDid you run ./data/script/'
                           'fetch_faster_rcnn_models.sh?').format(caffemodel))
        
        gpu_id = 0
        caffe.set_mode_gpu()
        caffe.set_device(0)
        cfg.GPU_ID = 0
        net = caffe.Net(prototxt, caffemodel, caffe.TEST)
        
        def output_csv(class_name, dets, result_dict, thresh=0.5):
            inds = np.where(dets[:, -1] >= thresh)[0]
            if len(inds) == 0:
                return
            sku = []
            for i in inds:
                bbox = dets[i, :4]
                score = dets[i, -1]
                # print class_name,bbox[0],bbox[1],bbox[2],bbox[3]
                sku = [class_name, bbox[0], bbox[1], bbox[2], bbox[3], score]
                result_dict.append(sku)
        
        
        def detect(im, THRESHOLD=0.5):
            # Detect all object classes and regress object bounds
            caffe.set_mode_gpu()
            caffe.set_device(0)
            timer = Timer()
            timer.tic()
            # im = cv2.imread(im_name)
            scores, boxes = im_detect(net, im)
            timer.toc()
            print ('Detection took {:.3f}s for {:d} object proposals').format(timer.total_time, boxes.shape[0])
            # Visualize detections for each class
            CONF_THRESH = THRESHOLD
            NMS_THRESH = 0.2
            im = im[:, :, (2, 1, 0)]
            # fig, ax = plt.subplots(figsize=(12, 12))
            # ax.imshow(im, aspect='equal')
            result_dict = []
            for cls_ind, cls in enumerate(CLASSES[1:]):
                cls_ind += 1  # because we skipped background
                cls_boxes = boxes[:, 4 * cls_ind:4 * (cls_ind + 1)]
                cls_scores = scores[:, cls_ind]
                dets = np.hstack((cls_boxes,
                                  cls_scores[:, np.newaxis])).astype(np.float32)
                keep = nms(dets, NMS_THRESH)
                dets = dets[keep, :]
                output_csv(cls, dets, result_dict, thresh=CONF_THRESH)
        
            return result_dict
        
        vc = cv2.VideoCapture(0)
        i = 0
        while True:
            r, f = vc.read()
            height, width, channels = f.shape
            roi = f[0:height, 0:width/2]
            resize = cv2.resize(roi, (roi.shape[1]/2,roi.shape[0]/2), interpolation=cv2.INTER_CUBIC)
            res = detect(resize)
        
            if len(res) > 0:
                print ('detect  %s' % (res[0]))
                left = ( res[0][1],res[0][2])
                right = ( res[0][3], res[0][4])
                cv2.rectangle(resize,left, right ,(55,255,155),2)
                cv2.imwrite(str(i) + '.jpg', resize)
                i += 1
            cv2.imshow('x', resize)
            k = cv2.waitKey(5)
            if k == ord('q'):
                vc.release()

实验效果:
训练集合5个品类,3600个图片,简单测试一下准确召回尚可
测了几十次都识别对了

基于python+caffe的faster rcnn训练识别

目录

1.环境setup

2.标注数据

3.训练和识别

你可能感兴趣的:(基于python+caffe的faster rcnn训练识别)