Faster R-CNN in Practice

Faster R-CNN plus my own head-detection dataset. Hopefully this goes smoothly~

Let's get started!

1. Gathering the materials

https://github.com/rbgirshick/py-faster-rcnn

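The remote server cannot get past the firewall, which is annoying, so I tracked down some of the required files on various Baidu Cloud shares.

2. Running the demo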

./tools/demo.py

The demo ran successfully.

3. Training on the VOC2007 dataset with the pretrained VGG16 model

./experiments/scripts/faster_rcnn_alt_opt.sh 0 VGG16 pascal_voc

Reading the training log

Training was interrupted:

/home/robot3/zrcai/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py:100: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future

  fg_inds, size=fg_rois_per_this_image, replace=False)

/home/robot3/zrcai/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py:113: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future

  bg_inds, size=bg_rois_per_this_image, replace=False)

/home/robot3/zrcai/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py:120: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future

  labels[fg_rois_per_this_image:] = 0

/home/robot3/zrcai/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py:176: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future

  bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]

/home/robot3/zrcai/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py:177: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future

  bbox_inside_weights[ind, start:end] = cfg.TRAIN.BBOX_INSIDE_WEIGHTS

I0218 18:06:15.150728  364 solver.cpp:229] Iteration 0, loss = 4.44656

I0218 18:06:15.150751  364 solver.cpp:245]    Train net output #0: loss_bbox = 0.412176 (* 1 = 0.412176 loss)

I0218 18:06:15.150758  364 solver.cpp:245]    Train net output #1: loss_cls = 4.03438 (* 1 = 4.03438 loss)

I0218 18:06:15.150761  364 sgd_solver.cpp:106] Iteration 0, lr = 0.001

F0218 18:06:15.154310  364 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0)  out of memory

*** Check failure stack trace: ***

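Training had barely got going when the GPU ran out of memory.

People online say that making batch_size a bit smaller fixes this.

I also ran the default faster-rcnn demo on our own images. Well, the results are still pretty poor.

The original demo run on the PartB dataset

After running all night, stage 1 finished, and then an error was raised.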


Solving...

/home/robot3/zrcai/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py:100: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future

  fg_inds, size=fg_rois_per_this_image, replace=False)

/home/robot3/zrcai/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py:113: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future

  bg_inds, size=bg_rois_per_this_image, replace=False)

/home/robot3/zrcai/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py:120: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future

  labels[fg_rois_per_this_image:] = 0

/home/robot3/zrcai/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py:176: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future

  bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]

Process Process-3:

Traceback (most recent call last):

  File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap

    self.run()

  File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run

    self._target(*self._args, **self._kwargs)

  File "./tools/train_faster_rcnn_alt_opt.py", line 195, in train_fast_rcnn

    max_iters=max_iters)

  File "/home/robot3/zrcai/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 160, in train_net

    model_paths = sw.train_model(max_iters)

  File "/home/robot3/zrcai/py-faster-rcnn/tools/../lib/fast_rcnn/train.py", line 101, in train_model

    self.solver.step(1)

  File "/home/robot3/zrcai/py-faster-rcnn/tools/../lib/roi_data_layer/layer.py", line 144, in forward

    blobs = self._get_next_minibatch()

  File "/home/robot3/zrcai/py-faster-rcnn/tools/../lib/roi_data_layer/layer.py", line 63, in _get_next_minibatch

    return get_minibatch(minibatch_db, self._num_classes)

  File "/home/robot3/zrcai/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 55, in get_minibatch

    num_classes)

  File "/home/robot3/zrcai/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 125, in _sample_rois

    roidb['bbox_targets'][keep_inds, :], num_classes)

  File "/home/robot3/zrcai/py-faster-rcnn/tools/../lib/roi_data_layer/minibatch.py", line 176, in _get_bbox_regression_labels

    bbox_targets[ind, start:end] = bbox_target_data[ind, 1:]

ValueError: could not broadcast input array from shape (4) into shape (0)


Reportedly the problem mainly comes from the files I modified.

Fix to try:

https://github.com/rbgirshick/py-faster-rcnn/issues/52
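
One commonly reported cause of this ValueError is a class label that does not fit the configured number of classes, for instance after editing the class list while a stale cache or an unchanged num_classes is still in use. The sketch below reproduces only the slicing arithmetic of the failing line in pure Python. It assumes each row of bbox_targets has 4 * num_classes columns and that class c occupies columns 4*c to 4*c + 4; the helper names are hypothetical and are not part of py-faster-rcnn.

```python
def slice_width(cls, num_classes):
    """Columns actually selected by [4*cls : 4*cls + 4] in a row
    that has 4 * num_classes columns (cls is assumed to be >= 0)."""
    total = 4 * num_classes
    start = min(4 * cls, total)
    end = min(4 * cls + 4, total)
    return end - start


def out_of_range_labels(labels, num_classes):
    """Class labels whose 4-column slot would be empty or truncated."""
    return sorted({c for c in labels if slice_width(c, num_classes) != 4})


# Two classes (background + head): label 1 gets a full slot of 4 columns,
# while a leftover label 2 gets an empty slot, which is the "(4) into (0)" case.
print(slice_width(1, 2), slice_width(2, 2))
print(out_of_range_labels([0, 1, 2, 1], num_classes=2))
```

If a check like this flags any label, the class list, the num_classes values in the prototxt files, and the cached roidb are the first places to look.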


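To check quickly whether my changes were correct, I reduced the iteration counts.

The iteration counts are set in py-faster-rcnn/tools/train_faster_rcnn_alt_opt.py:

max_iters = [80000, 40000, 80000, 40000]

These correspond, in order, to RPN stage 1, Fast R-CNN stage 1, RPN stage 2, and Fast R-CNN stage 2. I changed them to:

[20000, 10000, 20000, 10000]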
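
As a small illustration of how the four entries map to the training stages, here is a sketch that scales the default schedule by a constant factor. The stage names and the scaled_schedule helper are made up for this example; only the two lists of numbers come from the notes above.

```python
# Hypothetical stage labels, in the order the four entries are used.
STAGES = ["rpn_stage1", "fast_rcnn_stage1", "rpn_stage2", "fast_rcnn_stage2"]
DEFAULT_MAX_ITERS = [80000, 40000, 80000, 40000]


def scaled_schedule(max_iters, factor):
    """Return {stage: iterations} with every stage scaled by `factor`."""
    return {name: int(n * factor) for name, n in zip(STAGES, max_iters)}


# A quarter-length schedule for a quick sanity run.
quick = scaled_schedule(DEFAULT_MAX_ITERS, 0.25)
for name, iters in quick.items():
    print(name, iters)
```

Scaling all four stages together keeps the 2:1 ratio between the RPN and Fast R-CNN stages, so a short run still exercises every stage of the pipeline.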

I made the change and retrained, but it still fails.

I0219 19:25:45.681346 6976 solver.cpp:229] Iteration 0, loss = 1.40653

I0219 19:25:45.681371  6976 solver.cpp:245]    Train net output #0: loss_bbox = 0.365611 (* 1 = 0.365611 loss)

I0219 19:25:45.681376  6976 solver.cpp:245]    Train net output #1: loss_cls = 1.04092 (* 1 = 1.04092 loss)

I0219 19:25:45.681381  6976 sgd_solver.cpp:106] Iteration 0, lr = 0.001

F0219 19:25:45.684870  6976 syncedmem.cpp:56] Check failed: error == cudaSuccess (2 vs. 0)  out of memory

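According to what I read online, changing the batch size can help.

epoch, iteration, and batchsize come up all the time in deep learning. Here is the difference between the three as I understand it:

(1) batchsize: the batch size. Deep learning models are usually trained with SGD, where each training step draws batchsize samples from the training set;

(2) iteration: one iteration is one training step on batchsize samples;

(3) epoch: one epoch is one pass over all the samples in the training set.

For example, with 1000 training samples and batchsize = 10:

one pass over the whole training set takes

100 iterations, which is 1 epoch.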
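
The arithmetic in this example can be written down as a short sketch. The function names are hypothetical, and rounding the last partial batch up to a full iteration is an assumption made here, since frameworks differ on how they treat it.

```python
import math


def iterations_per_epoch(num_samples, batch_size):
    """Iterations needed to see every sample once (last partial batch counts)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return math.ceil(num_samples / batch_size)


def epochs_completed(iterations, num_samples, batch_size):
    """How many passes over the data a given number of iterations amounts to."""
    return iterations / iterations_per_epoch(num_samples, batch_size)


print(iterations_per_epoch(1000, 10))   # the example above: 100 iterations
print(epochs_completed(100, 1000, 10))  # 100 iterations of batch 10 = 1 epoch
```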

Strategy for resolving error == cudaSuccess (2 vs. 0)  out of memory:

Try to shrink the network filters and inputs.
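
To get a feel for what shrinking the inputs buys, the sketch below computes the resized image shape under a "scale the shorter side to a target, but cap the longer side" rule, which is how I understand py-faster-rcnn to prepare images. The defaults of 600 and 1000 are assumptions meant to mirror the TRAIN.SCALES and TRAIN.MAX_SIZE settings in lib/fast_rcnn/config.py, so check them against your copy; resized_shape itself is a hypothetical helper.

```python
def resized_shape(height, width, target_size=600, max_size=1000):
    """Shape after scaling the shorter side to target_size,
    without letting the longer side exceed max_size."""
    short_side, long_side = min(height, width), max(height, width)
    scale = float(target_size) / short_side
    if round(scale * long_side) > max_size:
        scale = float(max_size) / long_side
    return int(round(height * scale)), int(round(width * scale))


# A typical 375 x 500 VOC image at the assumed defaults and at a smaller scale.
big = resized_shape(375, 500)
small = resized_shape(375, 500, target_size=400, max_size=667)
print(big, small)
print("pixel ratio: %.2f" % (small[0] * small[1] / float(big[0] * big[1])))
```

Activation memory in the convolutional layers grows roughly with the number of input pixels, so a smaller scale reduces GPU memory use, usually at some cost in accuracy on small objects.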

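Let me run it with ZF first and see whether that works.

Note: delete the caches before retraining.

Delete all .pyc files under the py-faster-rcnn folder, the cache folder under the data folder, and the annotations_cache folder under data/VOCdevkit2007.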
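
The cleanup can be scripted. This is a minimal standalone sketch (Python 3, independent of the Python 2.7 environment used for training), assuming the devkit folder is named data/VOCdevkit2007; clean_caches is a hypothetical helper and it deletes files permanently, so point it at the right directory.

```python
import shutil
from pathlib import Path


def clean_caches(root, devkit="data/VOCdevkit2007"):
    """Delete *.pyc files and the two cache folders under `root`.
    Returns the list of paths that were removed."""
    root = Path(root)
    removed = []
    # Materialize the listing first so files are not deleted mid-iteration.
    for pyc in list(root.rglob("*.pyc")):
        pyc.unlink()
        removed.append(pyc)
    for cache_dir in (root / "data" / "cache", root / devkit / "annotations_cache"):
        if cache_dir.is_dir():
            shutil.rmtree(cache_dir)
            removed.append(cache_dir)
    return removed
```

Calling clean_caches("py-faster-rcnn") from the parent directory before each retraining run forces the roidb and annotation caches to be rebuilt from the current dataset.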

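ZF did not run through. OK, let me try the medium-sized VGG instead.

Both ZF and VGG get stuck at the second stage. I'll leave it running and see how things look tomorrow morning.

It finally finished!

The next step is to increase the iteration counts: back to 8, 4, 8, 4 (that is, [80000, 40000, 80000, 40000]).

Then I randomly picked a few images and tested them with demo.py. The results on small heads and densely packed heads are not great.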

For example, the small heads at the back, which show up as a patch of little black dots, are not detected.

Improvements I am considering:

1. Tuning the hyperparameters

2. Cross training

3. Modifying the network
