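Official repository: github.com/rbgirshick/fast-rcnn
Companion blog post: blog.csdn.net/samylee/article/details/50965935
Note: the Fast RCNN code on GitHub is no longer maintained; for newer work, see the Faster RCNN site.
The goal here is only to get familiar with Fast RCNN.
To start, follow the official steps exactly: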
1、Clone the Fast R-CNN repository
# Make sure to clone with --recursive
git clone --recursive https://github.com/rbgirshick/fast-rcnn.git
2、Build the Cython modules
cd $FRCN_ROOT/lib
make
3、Build Caffe and pycaffe
cd $FRCN_ROOT/caffe-fast-rcnn
# Now follow the Caffe installation instructions here:
# http://caffe.berkeleyvision.org/installation.html
# If you're experienced with Caffe and have all of the requirements installed
# and your Makefile.config in place, then simply do:
make -j8 && make pycaffe
Note: make -j8 may still fail because of the cuDNN version. Fix: replace the affected files in much the same way as described in blog.csdn.net/u011070171/article/details/52399317.
4、Download pre-computed Fast R-CNN detectors
cd $FRCN_ROOT
./data/scripts/fetch_fast_rcnn_models.sh
# Downloads the models. This script is very slow; fetch_fast_rcnn_models can also be downloaded from pan.baidu.com/s/1pJVburD (password: 11m0).
downloading...
…
…
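Read ./tools/demo.py
Conclusions:
(1) To avoid unnecessary trouble, keep every file in the folder where it is expected; otherwise errors are likely and time is wasted tracking them down.
(2) As expected, the proposals for each image are precomputed and stored, for example in ~/vision/fast-rcnn/data/demo/000004_boxes.mat. Each row holds 4 values, which should be the box coordinates.
(3) scores, boxes = im_detect(net, im, obj_proposals): once detection finishes, the network returns two arrays, scores and boxes. The code that consumes them is annotated below: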
CONF_THRESH = 0.8  # score threshold: presumably selects the boxes to keep before NMS
NMS_THRESH = 0.3   # overlap threshold for non-maximum suppression
for cls in classes:  # run NMS separately for each class
    cls_ind = CLASSES.index(cls)  # background corresponds to cls_ind = 0
    cls_boxes = boxes[:, 4*cls_ind:4*(cls_ind + 1)]  # so boxes should be N x (4 x (K+1)), where N is the number of proposals
    # The next lines filter by score and then apply non-maximum suppression
    cls_scores = scores[:, cls_ind]
    keep = np.where(cls_scores >= CONF_THRESH)[0]
    cls_boxes = cls_boxes[keep, :]
    cls_scores = cls_scores[keep]
    dets = np.hstack((cls_boxes,
                      cls_scores[:, np.newaxis])).astype(np.float32)
    keep = nms(dets, NMS_THRESH)
    dets = dets[keep, :]
    print 'All {} detections with p({} | box) >= {:.1f}'.format(cls, cls, CONF_THRESH)
    vis_detections(im, cls, dets, thresh=CONF_THRESH)  # visualize boxes with a score of at least CONF_THRESH (0.8)
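The score filtering and NMS steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the repository's nms implementation: the names iou and nms_sketch are hypothetical, detections are assumed to be (x1, y1, x2, y2, score) tuples, the +1 in the width and height assumes inclusive pixel coordinates, and the sample boxes are made up.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2, ...).

    Assumes inclusive pixel coordinates, hence the +1 in widths and heights.
    """
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    iw = max(0.0, ix2 - ix1 + 1)
    ih = max(0.0, iy2 - iy1 + 1)
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def nms_sketch(dets, thresh):
    """Greedy NMS: return indices of kept detections, highest score first."""
    order = sorted(range(len(dets)), key=lambda i: dets[i][4], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(dets[best], dets[i]) <= thresh]
    return keep


CONF_THRESH = 0.8
NMS_THRESH = 0.3
dets = [  # made-up detections for a single class
    (10, 10, 50, 50, 0.95),
    (12, 12, 52, 52, 0.90),      # heavy overlap with the first box
    (100, 100, 150, 150, 0.85),
    (200, 200, 220, 220, 0.40),  # below the score threshold
]
confident = [d for d in dets if d[4] >= CONF_THRESH]
kept = [confident[i] for i in nms_sketch(confident, NMS_THRESH)]
```

Here the low-score box is removed by the threshold first, and the second box is then suppressed because it overlaps the higher-scoring first box by more than NMS_THRESH.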
For the background class, I still do not know how its boxes are handled once they have been computed.
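The column layout inferred in the code comments, N×(4×(K+1)) with background at class index 0, can be illustrated with a toy example. This is only a sketch: the three-entry CLASSES tuple and all numbers are made up, plain Python lists stand in for NumPy arrays, and class_boxes is a hypothetical helper. It shows that slicing for a foreground class never touches columns 0 to 3; it says nothing about what the network actually writes there.

```python
CLASSES = ('__background__', 'aeroplane', 'car')  # hypothetical 3-class setup

# One row per proposal; 4 box coordinates per class, background first.
boxes = [
    [0, 0, 0, 0,   10, 20, 110, 120,   15, 25, 105, 115],
    [0, 0, 0, 0,   30, 40, 130, 140,   35, 45, 125, 135],
]


def class_boxes(boxes, cls_ind):
    """Plain-list equivalent of boxes[:, 4*cls_ind:4*(cls_ind + 1)]."""
    return [row[4 * cls_ind:4 * (cls_ind + 1)] for row in boxes]


car_ind = CLASSES.index('car')
car_boxes = class_boxes(boxes, car_ind)    # columns 8..11
background_boxes = class_boxes(boxes, 0)   # columns 0..3
```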
vgg_cnn_m_1024: the medium-sized network (model M in the paper), much smaller than VGG16.
I have not yet looked at how to use the "Computing object proposals" part.
Beyond the demo: installation for training and testing models
1、Download the training, validation, test data and VOCdevkit
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
2、Extract all of these tars into one directory named VOCdevkit
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
3、It should have this basic structure
$VOCdevkit/ # development kit
$VOCdevkit/VOCcode/ # VOC utility code
$VOCdevkit/VOC2007 # image sets, annotations, etc.
# ... and several other directories …
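Before creating the symlinks, it can help to confirm that the extracted directory matches the layout above. A minimal sketch: check_vocdevkit is a hypothetical helper, not part of Fast R-CNN, and it only looks for the two subdirectories named in the listing. The temporary directory stands in for $VOCdevkit.

```python
import os
import tempfile


def check_vocdevkit(devkit_root):
    """Return the expected subdirectories that are missing under devkit_root."""
    expected = ['VOCcode', 'VOC2007']
    return [name for name in expected
            if not os.path.isdir(os.path.join(devkit_root, name))]


# Demo on a throwaway directory that mimics an incomplete extraction.
demo_root = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_root, 'VOCcode'))
missing = check_vocdevkit(demo_root)  # VOC2007 was never created here
```

An empty list means both directories are present; anything else names what still needs to be extracted.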
4、Create symlinks for the PASCAL VOC dataset
cd $FRCN_ROOT/data
ln -s $VOCdevkit VOCdevkit2007
In my case: ln -s /home/echo/vision/Data/VOCdevkit /home/echo/vision/fast-rcnn/data/VOCdevkit2007
Note: Using symlinks is a good idea because you will likely want to share the same PASCAL dataset installation between multiple projects.
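The same link can also be created from Python, which is convenient inside a setup script. A minimal sketch, assuming a POSIX system: link_dataset is a hypothetical helper, and the temporary directories stand in for $VOCdevkit and $FRCN_ROOT/data.

```python
import os
import tempfile


def link_dataset(target, link_path):
    """Create link_path -> target unless a correct link already exists.

    Returns True if a link was created, False if it was already in place.
    """
    if os.path.islink(link_path):
        if os.path.realpath(link_path) == os.path.realpath(target):
            return False          # already linked, nothing to do
        os.remove(link_path)      # stale link pointing somewhere else
    os.symlink(target, link_path)
    return True


devkit = tempfile.mkdtemp()       # stands in for $VOCdevkit
data_dir = tempfile.mkdtemp()     # stands in for $FRCN_ROOT/data
link = os.path.join(data_dir, 'VOCdevkit2007')
created = link_dataset(devkit, link)
created_again = link_dataset(devkit, link)
```

Calling the helper twice is harmless: the second call detects the existing link and leaves it alone.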
Download pre-computed Selective Search object proposals
cd $FRCN_ROOT
./data/scripts/fetch_selective_search_data.sh
This will populate the $FRCN_ROOT/data folder with selective_search_data.
Download pre-trained ImageNet models
Pre-trained ImageNet models can be downloaded for the three networks described in the paper: CaffeNet (model S), VGG_CNN_M_1024 (model M), and VGG16 (model L).
cd $FRCN_ROOT
./data/scripts/fetch_imagenet_models.sh
These models are all available in the Caffe Model Zoo, but are provided here for your convenience.
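# Note: none of the above downloads easily. Even without them, the idea is clear: the Selective Search object proposals are just precomputed proposals for the VOC images, and the ImageNet models are just classification networks trained beforehand. On to the usage: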
Usage
Train a Fast R-CNN detector. For example, train a VGG16 network on VOC 2007 trainval:
./tools/train_net.py --gpu 0 --solver models/VGG16/solver.prototxt --weights data/imagenet_models/VGG16.v2.caffemodel
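The flags in this command can be mimicked with argparse to see how the values reach the script. This is a sketch, not the parser from train_net.py: the dest names gpu_id and pretrained_model and the --imdb flag name are assumptions, while args.imdb_name and its default 'voc_2007_trainval' are the ones that appear in train_net.py.

```python
import argparse


def build_parser():
    """Build a parser that accepts the flags used in the command above."""
    parser = argparse.ArgumentParser(description='Train a detector (sketch)')
    parser.add_argument('--gpu', dest='gpu_id', default=0, type=int)
    parser.add_argument('--solver', dest='solver', default=None, type=str)
    parser.add_argument('--weights', dest='pretrained_model',
                        default=None, type=str)
    # Assumed flag name; only args.imdb_name and its default are known.
    parser.add_argument('--imdb', dest='imdb_name',
                        default='voc_2007_trainval', type=str)
    return parser


# Parse the same arguments as the example command instead of sys.argv.
args = build_parser().parse_args([
    '--gpu', '0',
    '--solver', 'models/VGG16/solver.prototxt',
    '--weights', 'data/imagenet_models/VGG16.v2.caffemodel',
])
```

Because no dataset flag is passed, args.imdb_name falls back to its default, which is why the example command trains on VOC 2007 trainval without naming it.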
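Read ./tools/train_net.py. The file /home/echo/vision/fast-rcnn/lib/fast_rcnn/config.py that it uses defines the various network-related parameters.
In train_net.py, imdb = get_imdb(args.imdb_name) was unclear at first, until I noticed default='voc_2007_trainval', type=str further up, which I had overlooked. I still do not know where the data path is set, though.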
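One plausible way for the name voc_2007_trainval to connect to the data on disk is through the symlink created earlier (data/VOCdevkit2007). The sketch below is an assumption based on that naming convention, not a copy of the repository's code: resolve_voc_paths is a hypothetical helper, and the actual lookup should be confirmed in the module that defines get_imdb.

```python
import os


def resolve_voc_paths(imdb_name, frcn_root):
    """Split 'voc_<year>_<split>' and derive paths from the symlink name."""
    prefix, year, image_set = imdb_name.split('_')
    if prefix != 'voc':
        raise ValueError('not a VOC imdb name: %s' % imdb_name)
    # Assumed convention: the devkit lives at data/VOCdevkit<year>.
    devkit = os.path.join(frcn_root, 'data', 'VOCdevkit' + year)
    return {
        'year': year,
        'image_set': image_set,
        'devkit_path': devkit,
        'data_path': os.path.join(devkit, 'VOC' + year),
    }


paths = resolve_voc_paths('voc_2007_trainval', '/home/echo/vision/fast-rcnn')
```

Under this assumption, the year in the dataset name selects the symlink and the split selects which image list is read from inside it.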
…
...blah...blah
…
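Reading further down, nothing stands out. I probably will not find the concrete problems without actually running things, so let's do that:
Continuing with the airplane data:
In fact, all I need to do is swap my own data into the framework, which is simple: provide the data in whatever form the framework expects. That does feel rather superficial, though.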