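This is the second-to-last example in the Notebook Examples section of the Caffe documentation: http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/detection.ipynb
This example uses R-CNN for object detection.
R-CNN is a state-of-the-art object detection model that classifies region proposals with a fine-tuned Caffe model. For a full description of the R-CNN system and model, see:
Rich feature hierarchies for accurate object detection and semantic segmentation. Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik. CVPR 2014. Arxiv 2013.
In this example, the model is pre-trained on ImageNet and fine-tuned on ILSVRC13, and it outputs scores for the 200 detection classes. Note that these are raw per-class SVM scores for each window: they are not probability-calibrated and are not directly comparable across classes. The off-the-shelf model is used here only for convenience and is not the full R-CNN model.
Now let's run detection on the image caffe-master/examples/images/fish-bike.jpg.
First, some preparation is needed: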
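i. Install MATLAB. For the installation procedure, see: http://blog.csdn.net/thystar/article/details/50720691
ii. Add the MATLAB installation directory to PATH: run sudo gedit ~/.bashrc and append export PATH="/home/sindyz/software/matlab2014/bin":$PATH at the end of the file (this is my installation path). After saving, the computer needs to be restarted for the change to take effect. This step is required; otherwise the run fails with: OSError: [Errno 2] No such file or directory.
iii. Download Selective Search from https://github.com/sergeyk/selective_search_ijcv_with_python. It is used to generate the candidate windows; for an introduction to the Selective Search algorithm, see http://koen.me/research/selectivesearch/. After downloading, unpack it and run demo.m in MATLAB; if no error is reported, close MATLAB. Note that if this directory is not under $CAFFE-ROOT/python, it has to be added to PYTHONPATH. Mine is: export PYTHONPATH=/home/sindyz/code/matlabCode/selective_search_ijcv_with_python/:$PYTHONPATH (adjust this to your own setup).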
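Before running the example, it can help to confirm that both environment settings actually took effect. The following is a minimal sketch and not part of the original notebook: find_on_path and module_dir_listed are hypothetical helper names, and the checks only look at the environment of the current Python process.

```python
import os


def find_on_path(program, path_env=None):
    """Return the full path of `program` if a PATH entry holds it as an executable file."""
    if path_env is None:
        path_env = os.environ.get("PATH", "")
    for directory in path_env.split(os.pathsep):
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


def module_dir_listed(directory, pythonpath_env=None):
    """Return True if `directory` is one of the PYTHONPATH entries."""
    if pythonpath_env is None:
        pythonpath_env = os.environ.get("PYTHONPATH", "")
    wanted = os.path.normpath(directory)
    entries = [os.path.normpath(p) for p in pythonpath_env.split(os.pathsep) if p]
    return wanted in entries


if __name__ == "__main__":
    # None here means the matlab launcher is not visible to this process.
    print("matlab found at: {}".format(find_on_path("matlab")))
```

If find_on_path("matlab") returns None inside the notebook, the PATH change has not reached that process yet, which is the situation that leads to the OSError mentioned in step ii.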
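After completing the steps above, a few places still need attention and modification:
predictions = out[self.outputs[0]].squeeze(axis=(2, 3)) should be changed to
predictions = out[self.outputs[0]].squeeze(); otherwise the error ValueError: 'axis' entry 2 is out of bounds (-2, 2) is raised.
import selective_search_ijcv_with_python as selective_search
should be changed to import selective_search, because the Selective Search directory contains only the module selective_search.py; otherwise a module-not-found error occurs.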
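The first of the two changes can be reproduced in isolation with NumPy. This is a small sketch with made-up zero arrays standing in for the network output (10 windows by 200 classes is taken from the model in this post; everything else is an assumption for illustration). It also shows one caveat of a bare squeeze(): with a single window it drops the batch axis as well.

```python
import numpy as np

# Stand-in for the network output: 10 windows x 200 classes, already 2-D.
scores_2d = np.zeros((10, 200), dtype=np.float32)

try:
    scores_2d.squeeze(axis=(2, 3))
    failed = False
except ValueError:
    # A 2-D array has no axes 2 and 3, which is what the error message says.
    failed = True

# Without an axis argument, squeeze() simply drops any length-1 axes.
flat = scores_2d.squeeze()
from_4d = np.zeros((10, 200, 1, 1), dtype=np.float32).squeeze()

# Caveat: a single window loses its batch axis too; reshape keeps it.
single = np.zeros((1, 200), dtype=np.float32).squeeze()
safe_single = np.zeros((1, 200, 1, 1), dtype=np.float32).reshape(1, -1)

print(failed, flat.shape, from_4d.shape, single.shape, safe_single.shape)
```

For the single image in this post there are many windows per batch, so the bare squeeze() is sufficient; the reshape variant is only a suggestion for inputs that might produce one window.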
OK, now we can happily run the R-CNN example.
1. Change to the Caffe directory and import the required packages
import os
caffe_root = '/home/sindyz/caffe-master/'
os.chdir(caffe_root)
import sys
sys.path.insert(0,'./python')
2. Create a temporary directory, list the image to detect, and run the detector
! mkdir -p _temp
! echo examples/images/fish-bike.jpg > _temp/det_input.txt
! python/detect.py --crop_mode=selective_search --pretrained_model=models/bvlc_reference_rcnn_ilsvrc13/bvlc_reference_rcnn_ilsvrc13.caffemodel --model_def=models/bvlc_reference_rcnn_ilsvrc13/deploy.prototxt --gpu --raw_scale=255 _temp/det_input.txt _temp/det_output.h5
The output is as follows:
GPU mode
WARNING: Logging before InitGoogleLogging() is written to STDERR
I0608 10:32:38.067106 6131 net.cpp:42] Initializing net from parameters:
name: "R-CNN-ilsvrc13"
input: "data"
input_dim: 10
input_dim: 3
input_dim: 227
input_dim: 227
state {
phase: TEST
}
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "pool1"
type: "Pooling"
bottom: "conv1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm1"
type: "LRN"
bottom: "pool1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "norm1"
top: "conv2"
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "pool2"
type: "Pooling"
bottom: "conv2"
top: "pool2"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "norm2"
type: "LRN"
bottom: "pool2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "fc6"
type: "InnerProduct"
bottom: "pool5"
top: "fc6"
inner_product_param {
num_output: 4096
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc6"
top: "fc6"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc6"
top: "fc6"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc7"
type: "InnerProduct"
bottom: "fc6"
top: "fc7"
inner_product_param {
num_output: 4096
}
}
layer {
name: "relu7"
type: "ReLU"
bottom: "fc7"
top: "fc7"
}
layer {
name: "drop7"
type: "Dropout"
bottom: "fc7"
top: "fc7"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc-rcnn"
type: "InnerProduct"
bottom: "fc7"
top: "fc-rcnn"
inner_product_param {
num_output: 200
}
}
I0608 10:32:38.067556 6131 net.cpp:370] Input 0 -> data
I0608 10:32:38.067576 6131 layer_factory.hpp:74] Creating layer conv1
I0608 10:32:38.067585 6131 net.cpp:90] Creating Layer conv1
I0608 10:32:38.067589 6131 net.cpp:410] conv1 <- data
I0608 10:32:38.067595 6131 net.cpp:368] conv1 -> conv1
I0608 10:32:38.067603 6131 net.cpp:120] Setting up conv1
I0608 10:32:38.108999 6131 net.cpp:127] Top shape: 10 96 55 55 (2904000)
I0608 10:32:38.109035 6131 layer_factory.hpp:74] Creating layer relu1
I0608 10:32:38.109048 6131 net.cpp:90] Creating Layer relu1
I0608 10:32:38.109055 6131 net.cpp:410] relu1 <- conv1
I0608 10:32:38.109063 6131 net.cpp:357] relu1 -> conv1 (in-place)
I0608 10:32:38.109076 6131 net.cpp:120] Setting up relu1
I0608 10:32:38.109233 6131 net.cpp:127] Top shape: 10 96 55 55 (2904000)
I0608 10:32:38.109244 6131 layer_factory.hpp:74] Creating layer pool1
I0608 10:32:38.109257 6131 net.cpp:90] Creating Layer pool1
I0608 10:32:38.109263 6131 net.cpp:410] pool1 <- conv1
I0608 10:32:38.109269 6131 net.cpp:368] pool1 -> pool1
I0608 10:32:38.109277 6131 net.cpp:120] Setting up pool1
I0608 10:32:38.109311 6131 net.cpp:127] Top shape: 10 96 27 27 (699840)
I0608 10:32:38.109318 6131 layer_factory.hpp:74] Creating layer norm1
I0608 10:32:38.109325 6131 net.cpp:90] Creating Layer norm1
I0608 10:32:38.109329 6131 net.cpp:410] norm1 <- pool1
I0608 10:32:38.109335 6131 net.cpp:368] norm1 -> norm1
I0608 10:32:38.109341 6131 net.cpp:120] Setting up norm1
I0608 10:32:38.109349 6131 net.cpp:127] Top shape: 10 96 27 27 (699840)
I0608 10:32:38.109352 6131 layer_factory.hpp:74] Creating layer conv2
I0608 10:32:38.109360 6131 net.cpp:90] Creating Layer conv2
I0608 10:32:38.109364 6131 net.cpp:410] conv2 <- norm1
I0608 10:32:38.109370 6131 net.cpp:368] conv2 -> conv2
I0608 10:32:38.109376 6131 net.cpp:120] Setting up conv2
I0608 10:32:38.109931 6131 net.cpp:127] Top shape: 10 256 27 27 (1866240)
I0608 10:32:38.109947 6131 layer_factory.hpp:74] Creating layer relu2
I0608 10:32:38.109954 6131 net.cpp:90] Creating Layer relu2
I0608 10:32:38.109959 6131 net.cpp:410] relu2 <- conv2
I0608 10:32:38.109966 6131 net.cpp:357] relu2 -> conv2 (in-place)
I0608 10:32:38.109972 6131 net.cpp:120] Setting up relu2
I0608 10:32:38.110002 6131 net.cpp:127] Top shape: 10 256 27 27 (1866240)
I0608 10:32:38.110008 6131 layer_factory.hpp:74] Creating layer pool2
I0608 10:32:38.110014 6131 net.cpp:90] Creating Layer pool2
I0608 10:32:38.110018 6131 net.cpp:410] pool2 <- conv2
I0608 10:32:38.110024 6131 net.cpp:368] pool2 -> pool2
I0608 10:32:38.110030 6131 net.cpp:120] Setting up pool2
I0608 10:32:38.110136 6131 net.cpp:127] Top shape: 10 256 13 13 (432640)
I0608 10:32:38.110144 6131 layer_factory.hpp:74] Creating layer norm2
I0608 10:32:38.110152 6131 net.cpp:90] Creating Layer norm2
I0608 10:32:38.110157 6131 net.cpp:410] norm2 <- pool2
I0608 10:32:38.110162 6131 net.cpp:368] norm2 -> norm2
I0608 10:32:38.110168 6131 net.cpp:120] Setting up norm2
I0608 10:32:38.110175 6131 net.cpp:127] Top shape: 10 256 13 13 (432640)
I0608 10:32:38.110179 6131 layer_factory.hpp:74] Creating layer conv3
I0608 10:32:38.110187 6131 net.cpp:90] Creating Layer conv3
I0608 10:32:38.110191 6131 net.cpp:410] conv3 <- norm2
I0608 10:32:38.110198 6131 net.cpp:368] conv3 -> conv3
I0608 10:32:38.110203 6131 net.cpp:120] Setting up conv3
I0608 10:32:38.111160 6131 net.cpp:127] Top shape: 10 384 13 13 (648960)
I0608 10:32:38.111176 6131 layer_factory.hpp:74] Creating layer relu3
I0608 10:32:38.111183 6131 net.cpp:90] Creating Layer relu3
I0608 10:32:38.111189 6131 net.cpp:410] relu3 <- conv3
I0608 10:32:38.111194 6131 net.cpp:357] relu3 -> conv3 (in-place)
I0608 10:32:38.111202 6131 net.cpp:120] Setting up relu3
I0608 10:32:38.111232 6131 net.cpp:127] Top shape: 10 384 13 13 (648960)
I0608 10:32:38.111238 6131 layer_factory.hpp:74] Creating layer conv4
I0608 10:32:38.111243 6131 net.cpp:90] Creating Layer conv4
I0608 10:32:38.111248 6131 net.cpp:410] conv4 <- conv3
I0608 10:32:38.111253 6131 net.cpp:368] conv4 -> conv4
I0608 10:32:38.111260 6131 net.cpp:120] Setting up conv4
I0608 10:32:38.112344 6131 net.cpp:127] Top shape: 10 384 13 13 (648960)
I0608 10:32:38.112357 6131 layer_factory.hpp:74] Creating layer relu4
I0608 10:32:38.112365 6131 net.cpp:90] Creating Layer relu4
I0608 10:32:38.112370 6131 net.cpp:410] relu4 <- conv4
I0608 10:32:38.112375 6131 net.cpp:357] relu4 -> conv4 (in-place)
I0608 10:32:38.112381 6131 net.cpp:120] Setting up relu4
I0608 10:32:38.112411 6131 net.cpp:127] Top shape: 10 384 13 13 (648960)
I0608 10:32:38.112416 6131 layer_factory.hpp:74] Creating layer conv5
I0608 10:32:38.112422 6131 net.cpp:90] Creating Layer conv5
I0608 10:32:38.112427 6131 net.cpp:410] conv5 <- conv4
I0608 10:32:38.112432 6131 net.cpp:368] conv5 -> conv5
I0608 10:32:38.112439 6131 net.cpp:120] Setting up conv5
I0608 10:32:38.113263 6131 net.cpp:127] Top shape: 10 256 13 13 (432640)
I0608 10:32:38.113279 6131 layer_factory.hpp:74] Creating layer relu5
I0608 10:32:38.113286 6131 net.cpp:90] Creating Layer relu5
I0608 10:32:38.113291 6131 net.cpp:410] relu5 <- conv5
I0608 10:32:38.113297 6131 net.cpp:357] relu5 -> conv5 (in-place)
I0608 10:32:38.113303 6131 net.cpp:120] Setting up relu5
I0608 10:32:38.113333 6131 net.cpp:127] Top shape: 10 256 13 13 (432640)
I0608 10:32:38.113339 6131 layer_factory.hpp:74] Creating layer pool5
I0608 10:32:38.113347 6131 net.cpp:90] Creating Layer pool5
I0608 10:32:38.113350 6131 net.cpp:410] pool5 <- conv5
I0608 10:32:38.113356 6131 net.cpp:368] pool5 -> pool5
I0608 10:32:38.113363 6131 net.cpp:120] Setting up pool5
I0608 10:32:38.113502 6131 net.cpp:127] Top shape: 10 256 6 6 (92160)
I0608 10:32:38.113520 6131 layer_factory.hpp:74] Creating layer fc6
I0608 10:32:38.113528 6131 net.cpp:90] Creating Layer fc6
I0608 10:32:38.113533 6131 net.cpp:410] fc6 <- pool5
I0608 10:32:38.113538 6131 net.cpp:368] fc6 -> fc6
I0608 10:32:38.113545 6131 net.cpp:120] Setting up fc6
I0608 10:32:38.140440 6131 net.cpp:127] Top shape: 10 4096 (40960)
I0608 10:32:38.140478 6131 layer_factory.hpp:74] Creating layer relu6
I0608 10:32:38.140492 6131 net.cpp:90] Creating Layer relu6
I0608 10:32:38.140498 6131 net.cpp:410] relu6 <- fc6
I0608 10:32:38.140506 6131 net.cpp:357] relu6 -> fc6 (in-place)
I0608 10:32:38.140516 6131 net.cpp:120] Setting up relu6
I0608 10:32:38.140576 6131 net.cpp:127] Top shape: 10 4096 (40960)
I0608 10:32:38.140583 6131 layer_factory.hpp:74] Creating layer drop6
I0608 10:32:38.140589 6131 net.cpp:90] Creating Layer drop6
I0608 10:32:38.140594 6131 net.cpp:410] drop6 <- fc6
I0608 10:32:38.140599 6131 net.cpp:357] drop6 -> fc6 (in-place)
I0608 10:32:38.140605 6131 net.cpp:120] Setting up drop6
I0608 10:32:38.140611 6131 net.cpp:127] Top shape: 10 4096 (40960)
I0608 10:32:38.140616 6131 layer_factory.hpp:74] Creating layer fc7
I0608 10:32:38.140622 6131 net.cpp:90] Creating Layer fc7
I0608 10:32:38.140630 6131 net.cpp:410] fc7 <- fc6
I0608 10:32:38.140636 6131 net.cpp:368] fc7 -> fc7
I0608 10:32:38.140643 6131 net.cpp:120] Setting up fc7
I0608 10:32:38.153045 6131 net.cpp:127] Top shape: 10 4096 (40960)
I0608 10:32:38.153095 6131 layer_factory.hpp:74] Creating layer relu7
I0608 10:32:38.153105 6131 net.cpp:90] Creating Layer relu7
I0608 10:32:38.153112 6131 net.cpp:410] relu7 <- fc7
I0608 10:32:38.153120 6131 net.cpp:357] relu7 -> fc7 (in-place)
I0608 10:32:38.153129 6131 net.cpp:120] Setting up relu7
I0608 10:32:38.153200 6131 net.cpp:127] Top shape: 10 4096 (40960)
I0608 10:32:38.153206 6131 layer_factory.hpp:74] Creating layer drop7
I0608 10:32:38.153214 6131 net.cpp:90] Creating Layer drop7
I0608 10:32:38.153219 6131 net.cpp:410] drop7 <- fc7
I0608 10:32:38.153224 6131 net.cpp:357] drop7 -> fc7 (in-place)
I0608 10:32:38.153231 6131 net.cpp:120] Setting up drop7
I0608 10:32:38.153237 6131 net.cpp:127] Top shape: 10 4096 (40960)
I0608 10:32:38.153242 6131 layer_factory.hpp:74] Creating layer fc-rcnn
I0608 10:32:38.153249 6131 net.cpp:90] Creating Layer fc-rcnn
I0608 10:32:38.153254 6131 net.cpp:410] fc-rcnn <- fc7
I0608 10:32:38.153259 6131 net.cpp:368] fc-rcnn -> fc-rcnn
I0608 10:32:38.153267 6131 net.cpp:120] Setting up fc-rcnn
I0608 10:32:38.154058 6131 net.cpp:127] Top shape: 10 200 (2000)
I0608 10:32:38.154080 6131 net.cpp:194] fc-rcnn does not need backward computation.
I0608 10:32:38.154085 6131 net.cpp:194] drop7 does not need backward computation.
I0608 10:32:38.154090 6131 net.cpp:194] relu7 does not need backward computation.
I0608 10:32:38.154095 6131 net.cpp:194] fc7 does not need backward computation.
I0608 10:32:38.154100 6131 net.cpp:194] drop6 does not need backward computation.
I0608 10:32:38.154105 6131 net.cpp:194] relu6 does not need backward computation.
I0608 10:32:38.154110 6131 net.cpp:194] fc6 does not need backward computation.
I0608 10:32:38.154115 6131 net.cpp:194] pool5 does not need backward computation.
I0608 10:32:38.154129 6131 net.cpp:194] relu5 does not need backward computation.
I0608 10:32:38.154134 6131 net.cpp:194] conv5 does not need backward computation.
I0608 10:32:38.154139 6131 net.cpp:194] relu4 does not need backward computation.
I0608 10:32:38.154145 6131 net.cpp:194] conv4 does not need backward computation.
I0608 10:32:38.154150 6131 net.cpp:194] relu3 does not need backward computation.
I0608 10:32:38.154155 6131 net.cpp:194] conv3 does not need backward computation.
I0608 10:32:38.154160 6131 net.cpp:194] norm2 does not need backward computation.
I0608 10:32:38.154165 6131 net.cpp:194] pool2 does not need backward computation.
I0608 10:32:38.154170 6131 net.cpp:194] relu2 does not need backward computation.
I0608 10:32:38.154175 6131 net.cpp:194] conv2 does not need backward computation.
I0608 10:32:38.154180 6131 net.cpp:194] norm1 does not need backward computation.
I0608 10:32:38.154193 6131 net.cpp:194] pool1 does not need backward computation.
I0608 10:32:38.154198 6131 net.cpp:194] relu1 does not need backward computation.
I0608 10:32:38.154203 6131 net.cpp:194] conv1 does not need backward computation.
I0608 10:32:38.154208 6131 net.cpp:235] This network produces output fc-rcnn
I0608 10:32:38.154220 6131 net.cpp:482] Collecting Learning Rate and Weight Decay.
I0608 10:32:38.154227 6131 net.cpp:247] Network initialization done.
I0608 10:32:38.154232 6131 net.cpp:248] Memory required for data: 62425920
E0608 10:32:38.221285 6131 upgrade_proto.cpp:618] Attempting to upgrade input file specified using deprecated V1LayerParameter: models/bvlc_reference_rcnn_ilsvrc13/bvlc_reference_rcnn_ilsvrc13.caffemodel
I0608 10:32:38.324671 6131 upgrade_proto.cpp:626] Successfully upgraded file specified using deprecated V1LayerParameter
Loading input...
selective_search_rcnn({'/home/ouxinyu/caffe-master/examples/images/fish-bike.jpg'}, '/tmp/tmpu85WGa.mat')
Processed 1570 windows in 17.131 s.
/usr/lib/python2.7/dist-packages/pandas/io/pytables.py:2487: PerformanceWarning:
your performance may suffer as PyTables will pickle object types that it cannot
map directly to c-types [inferred_type->mixed,key->block1_values] [items->['prediction']]
warnings.warn(ws, PerformanceWarning)
Saved to _temp/det_output.h5 in 0.025 s.
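The "Top shape" lines in the log above follow from the layer parameters printed at the start of the log. As a rough check, the spatial sizes can be recomputed by hand; the sketch below assumes the usual conventions (convolution rounds the output size down, Caffe pooling rounds it up) and uses hypothetical helper names conv_out and pool_out.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride=1):
    """Spatial output size of unpadded Caffe-style pooling (rounds up)."""
    return int(math.ceil((size - kernel) / float(stride))) + 1


sizes = {}
s = 227                                                  # input_dim in the log
s = sizes["conv1"] = conv_out(s, kernel=11, stride=4)
s = sizes["pool1"] = pool_out(s, kernel=3, stride=2)
s = sizes["conv2"] = conv_out(s, kernel=5, pad=2)
s = sizes["pool2"] = pool_out(s, kernel=3, stride=2)
s = sizes["conv3"] = conv_out(s, kernel=3, pad=1)        # conv4 and conv5 use the same settings
s = sizes["pool5"] = pool_out(s, kernel=3, stride=2)

# Element count of the conv1 blob: batch x channels x height x width.
conv1_count = 10 * 96 * sizes["conv1"] * sizes["conv1"]

print(sizes, conv1_count)
```

The values agree with the log: 55 for conv1, 27 for pool1 and conv2, 13 for pool2 through conv5, 6 for pool5, and 2904000 elements for the conv1 blob.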
4. After the run, the file names, the selected windows, and the detection scores are stored in _temp/det_output.h5. Inspect the results:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
df = pd.read_hdf('_temp/det_output.h5', 'df')
print(df.shape)
print(df.iloc[0])
Output:
(1570, 5)
prediction [-2.62247, -2.84579, -2.85122, -3.20838, -1.94...
ymin 79.846
xmin 9.62
ymax 246.31
xmax 339.624
Name: /home/sindyz/caffe-master/examples/images/fish-bike.jpg, dtype: object
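Selective Search proposed 1570 regions as input to R-CNN. The number of proposals varies from image to image with the content and size of the image; in other words, selective search is not scale invariant.
In general, detect.py is most efficient when run on many images: it first extracts the window proposals for all of them, batches the windows for GPU processing, and then writes out the results. To process a batch, simply list the images in images_file.
Although this example only shows R-CNN detection for ImageNet, detect.py adapts to the input dimensions, batch size, and output categories of different Caffe models. You can swap in the model definition and pretrained model you need; see detect.py --help for the parameters to choose for your data.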
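The batch workflow described above can be sketched as follows. This is only an illustration: write_image_list and build_detect_command are hypothetical helpers, the second image path is an assumed example, and the command uses only the flags that already appear in step 2.

```python
import os


def write_image_list(image_paths, list_path):
    """Write one image path per line into the images file."""
    directory = os.path.dirname(list_path)
    if directory and not os.path.isdir(directory):
        os.makedirs(directory)
    with open(list_path, "w") as f:
        for path in image_paths:
            f.write(path + "\n")


def build_detect_command(list_path, output_path, model_dir, use_gpu=True):
    """Assemble the detect.py call from step 2 as an argument list."""
    # Assumes the weights file is named after its directory, as in step 2.
    name = os.path.basename(model_dir)
    cmd = [
        "python", "python/detect.py",
        "--crop_mode=selective_search",
        "--pretrained_model={}/{}.caffemodel".format(model_dir, name),
        "--model_def={}/deploy.prototxt".format(model_dir),
        "--raw_scale=255",
    ]
    if use_gpu:
        cmd.append("--gpu")
    return cmd + [list_path, output_path]


if __name__ == "__main__":
    images = ["examples/images/fish-bike.jpg",
              "examples/images/cat.jpg"]  # second path is an assumed example
    write_image_list(images, "_temp/det_input.txt")
    print(" ".join(build_detect_command(
        "_temp/det_input.txt", "_temp/det_output.h5",
        "models/bvlc_reference_rcnn_ilsvrc13")))
```

The returned list could be passed to subprocess.check_call from the Caffe root directory; it is only printed here so the sketch has no side effects beyond writing the list file.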
5. Load the ILSVRC13 detection class names and build a DataFrame of the predictions. Note: the class-name file is fetched with ./data/ilsvrc12/get_ilsvrc12_aux.sh
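# The working directory is caffe_root (see step 1), so the path is relative to it.
with open('data/ilsvrc12/det_synset_words.txt') as f:
    labels_df = pd.DataFrame([
        {
            'synset_id': l.strip().split(' ')[0],
            'name': ' '.join(l.strip().split(' ')[1:]).split(',')[0]
        }
        for l in f.readlines()
    ])
# Returns a sorted copy for inspection only; labels_df itself keeps the file
# order, which is the order used for the columns below.
labels_df.sort_values('synset_id')
predictions_df = pd.DataFrame(np.vstack(df.prediction.values), columns=labels_df['name'])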
print(predictions_df.iloc[0])
name
accordion -2.622471
airplane -2.845789
ant -2.851220
antelope -3.208377
apple -1.949950
armadillo -2.472936
artichoke -2.201685
axe -2.327404
baby bed -2.737924
backpack -2.176764
bagel -2.681061
balance beam -2.722538
banana -2.390628
band aid -1.598909
banjo -2.298197
...
trombone -2.582361
trumpet -2.352853
turtle -2.360860
tv or monitor -2.761042
unicycle -2.218468
vacuum -1.907718
violin -2.757080
volleyball -2.723690
waffle iron -2.418540
washer -2.408994
water bottle -2.174899
watercraft -2.837426
whale -3.120339
wine bottle -2.772961
zebra -2.742914
Name: 0, Length: 200, dtype: float32
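The list comprehension in step 5 splits each line of the class-name file into a synset id and a short display name. A small standalone sketch of that parsing is shown below; parse_synset_line is a hypothetical helper, and the two sample lines use made-up ids rather than real entries from det_synset_words.txt.

```python
def parse_synset_line(line):
    """Split 'ID name, alias, ...' into the id and the first name."""
    parts = line.strip().split(' ')
    return {
        'synset_id': parts[0],
        # Re-join the words after the id, then keep only the text before the first comma.
        'name': ' '.join(parts[1:]).split(',')[0],
    }


# Made-up sample lines in the same "id name[, alias...]" layout.
sample_lines = [
    'n00000001 accordion, piano accordion, squeeze box\n',
    'n00000002 tv or monitor\n',
]
records = [parse_synset_line(l) for l in sample_lines]
print(records)
```

Multi-word names such as "tv or monitor" survive because the words are re-joined before the comma split.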
plt.gray()
plt.matshow(predictions_df.values)
plt.xlabel('Classes')
plt.ylabel('Windows')
7. Take the maximum score of each class over all windows and print the ten highest
max_s = predictions_df.max(0)
max_s = max_s.sort_values(ascending=False)
print(max_s[:10])
Output:
name
person 1.835771
bicycle 0.866109
unicycle 0.057079
motorcycle -0.006122
banjo -0.028208
turtle -0.189833
electric fan -0.206787
cart -0.214237
lizard -0.393519
helmet -0.477942
dtype: float32
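The same per-class maximum can be written directly with NumPy, which makes the axis convention explicit: rows are windows and columns are classes. The tiny score matrix below is invented for illustration and does not come from the run above.

```python
import numpy as np

class_names = np.array(['person', 'bicycle', 'turtle'])

# Invented scores: 4 windows (rows) x 3 classes (columns).
scores = np.array([
    [ 1.8, -0.4, -1.3],
    [-0.9,  0.9, -0.2],
    [-1.5, -1.1, -0.6],
    [ 0.2,  0.1, -2.0],
])

best_per_class = scores.max(axis=0)     # highest score each class reaches in any window
best_window = scores.argmax(axis=0)     # index of the window that reaches it
order = np.argsort(-best_per_class)     # class indices, best class first

for k in order[:2]:
    print(class_names[k], best_per_class[k], 'window', best_window[k])
```

Keeping best_window alongside best_per_class is what later lets the top-scoring box for a class be looked up and drawn.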
# Find, print, and display the top detections: person and bicycle.
i = predictions_df['person'].argmax()
j = predictions_df['bicycle'].argmax()
# Show top predictions for top detection.
f = pd.Series(df['prediction'].iloc[i], index=labels_df['name'])
print('Top detection:')
print(f.sort_values(ascending=False)[:5])
print('')
# Show top predictions for second-best detection.
f = pd.Series(df['prediction'].iloc[j], index=labels_df['name'])
print('Second-best detection:')
print(f.sort_values(ascending=False)[:5])
# Show top detection in red, second-best top detection in blue.
im = plt.imread('examples/images/fish-bike.jpg')
plt.imshow(im)
currentAxis = plt.gca()
det = df.iloc[i]
coords = (det['xmin'], det['ymin']), det['xmax'] - det['xmin'], det['ymax'] - det['ymin']
currentAxis.add_patch(plt.Rectangle(*coords, fill=False, edgecolor='r', linewidth=5))
det = df.iloc[j]
coords = (det['xmin'], det['ymin']), det['xmax'] - det['xmin'], det['ymax'] - det['ymin']
currentAxis.add_patch(plt.Rectangle(*coords, fill=False, edgecolor='b', linewidth=5))
Output:
Top detection:
name
person 1.835771
swimming trunks -1.150371
rubber eraser -1.231106
turtle -1.266037
plastic bag -1.303266
dtype: float32
Second-best detection:
name
bicycle 0.866109
unicycle -0.359140
scorpion -0.811621
lobster -0.982891
lamp -1.096809
dtype: float32
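The drawing code above converts each window from corner coordinates into the (anchor point, width, height) form that plt.Rectangle expects. A minimal sketch of that conversion, using the first window printed in step 4 as input (box_to_rect is a hypothetical helper name):

```python
def box_to_rect(xmin, ymin, xmax, ymax):
    """Convert corner coordinates to ((x, y), width, height)."""
    return (xmin, ymin), xmax - xmin, ymax - ymin


# Coordinates of the first window shown by print(df.iloc[0]) in step 4.
anchor, width, height = box_to_rect(xmin=9.62, ymin=79.846,
                                    xmax=339.624, ymax=246.31)
print(anchor, width, height)
```

The anchor is (xmin, ymin), which in image coordinates is the top-left corner, since imshow places the origin at the top of the image.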
9. Take all the bicycle detections and apply NMS (non-maximum suppression) to remove overlapping windows.
def nms_detections(dets, overlap=0.3):
    """
    Non-maximum suppression: Greedily select high-scoring detections and
    skip detections that are significantly covered by a previously
    selected detection.
    This version is translated from Matlab code by Tomasz Malisiewicz,
    who sped up Pedro Felzenszwalb's code.
    Parameters
    ----------
    dets: ndarray
        each row is ['xmin', 'ymin', 'xmax', 'ymax', 'score']
    overlap: float
        minimum overlap ratio (0.3 default)
    Output
    ------
    dets: ndarray
        remaining after suppression.
    """
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]
    ind = np.argsort(dets[:, 4])
    w = x2 - x1
    h = y2 - y1
    area = (w * h).astype(float)
    pick = []
    while len(ind) > 0:
        i = ind[-1]
        pick.append(i)
        ind = ind[:-1]
        xx1 = np.maximum(x1[i], x1[ind])
        yy1 = np.maximum(y1[i], y1[ind])
        xx2 = np.minimum(x2[i], x2[ind])
        yy2 = np.minimum(y2[i], y2[ind])
        w = np.maximum(0., xx2 - xx1)
        h = np.maximum(0., yy2 - yy1)
        wh = w * h
        o = wh / (area[i] + area[ind] - wh)
        ind = ind[np.nonzero(o <= overlap)[0]]
    return dets[pick, :]
scores = predictions_df['bicycle']
windows = df[['xmin', 'ymin', 'xmax', 'ymax']].values
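# Use the underlying array: adding an axis by indexing a Series is not supported in recent pandas.
dets = np.hstack((windows, scores.values[:, np.newaxis]))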
nms_dets = nms_detections(dets)
plt.imshow(im)
currentAxis = plt.gca()
colors = ['r', 'b', 'y']
for c, det in zip(colors, nms_dets[:3]):
    currentAxis.add_patch(
        plt.Rectangle((det[0], det[1]), det[2]-det[0], det[3]-det[1],
        fill=False, edgecolor=c, linewidth=5)
    )
print('scores: {}'.format(nms_dets[:3, 4]))
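The quantity that nms_detections compares against the overlap threshold is the intersection over union (IoU) of two boxes. The sketch below works through one suppression step on three invented boxes; iou is a hypothetical standalone helper, not a function from the notebook.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


# Invented boxes: `best` is the current highest-scoring detection.
best = (0.0, 0.0, 10.0, 10.0)
near = (1.0, 1.0, 11.0, 11.0)     # overlaps `best` heavily
far = (20.0, 20.0, 30.0, 30.0)    # does not touch `best`

# One greedy step: keep only the candidates whose IoU with `best` is at most 0.3.
kept = [box for box in (near, far) if iou(best, box) <= 0.3]
print(iou(best, near), iou(best, far), kept)
```

Here the near box has an IoU of 81/119 with the best box, roughly 0.68, so it is suppressed, while the far box has an IoU of 0 and stays in the candidate list for the next round.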
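The bicycle was an easy instance because this image was in the training set for that class. The person result, however, is a true detection, since the image was not in the training set for that class.
Next, you can also run detection on your own images.
11. Delete the _temp directory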
!rm -rf _temp
References:
http://nbviewer.jupyter.org/github/BVLC/caffe/blob/master/examples/detection.ipynb
http://nbviewer.jupyter.org/github/ouxinyu/ouxinyu.github.io/blob/master/MyCodes/caffe-master/detection.ipynb