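This hands-on walkthrough runs on ***Ubuntu*** and is based on the Faster R-CNN implementation that jwyang published on GitHub (jwyang/faster-rcnn.pytorch). The model was developed against ***PyTorch 1.0.0***, so check your own version. In this post we will train and run detection on our own Pascal VOC-format dataset. This is not a beginner tutorial, so installing PyTorch is not covered; please refer to other posts for that.
To summarize my environment: Ubuntu 16.04, CUDA 10.0, PyTorch 1.0.0, torchvision 0.2.1, Python 3.5.
First, clone the code: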
git clone https://github.com/jwyang/faster-rcnn.pytorch.git
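If, like me, you are using PyTorch 1.0 or later, you need to switch the cloned repository to the pytorch-1.0 branch. The checkout has to run against the cloned repository, so without changing directory, enter:
git -C faster-rcnn.pytorch checkout pytorch-1.0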
This step is critical, so do not skip it! Forgetting it causes many errors later on.
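To make the version rule above concrete, here is a minimal sketch of the decision in Python. The helper name `branch_for` is hypothetical, and the fallback branch name `master` for older PyTorch versions is my assumption; only the `pytorch-1.0` branch is named in this post.

```python
def branch_for(torch_version):
    """Pick the repository branch for a PyTorch version string such as '1.0.0'."""
    # Use the first two dot-separated fields, e.g. '1.0.1.post2' -> (1, 0)
    major, minor = (int(part) for part in torch_version.split(".")[:2])
    # PyTorch 1.0 and later need the pytorch-1.0 branch (as stated in this post);
    # 'master' as the fallback for older versions is an assumption.
    return "pytorch-1.0" if (major, minor) >= (1, 0) else "master"


if __name__ == "__main__":
    print(branch_for("1.0.0"))
```

In practice you would pass `torch.__version__` instead of a literal string.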
After switching branches, run the following commands in order to set up the dependencies:
# Enter the faster-rcnn.pytorch directory and create a data folder inside it
cd faster-rcnn.pytorch && mkdir data
# Install the required Python packages; if this fails with a permission error, prefix the command with sudo
pip install -r requirements.txt
# Download the COCO API and build it
git clone https://github.com/pdollar/coco.git
cd coco/PythonAPI
make
When that finishes, return to the faster-rcnn.pytorch directory and then enter the lib folder:
cd lib
python setup.py build develop
If you hit the ffi deprecation error at this point:
ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead
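it means the Faster R-CNN code was not switched to the PyTorch 1.0 version. After git clone, you need to switch the repository to the pytorch-1.0 branch with git checkout pytorch-1.0.
If you have made it this far, the dependency setup is complete and you can start training on your own dataset.
Some readers may want to test the freshly downloaded model right away. Unfortunately that is not possible yet: the author only provides weights for the feature-extraction layers, not for the later classification and regression heads, so the model cannot be tested directly. The exception is if you have a model that someone else has already trained, which you can copy over and test as is.
The author provides pretrained models for two feature-extraction networks; the download links are listed in the repository README.
With the steps above done, we can get ready to train. First prepare your own dataset; I used a Pascal VOC-style dataset this time. I will not describe how to build a Pascal VOC dataset in detail and will only pick out a few key points:
Note that the set used for training is trainval, not train. The dataset is laid out as follows: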
VOCdevkit
└── VOC2007
    ├── Annotations
    ├── ImageSets
    │   └── Main
    │       ├── test.txt
    │       ├── train.txt
    │       ├── trainval.txt
    │       └── val.txt
    └── JPEGImages
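Each of the four files under ImageSets/Main lists image IDs (file names without the extension), one per line. As a rough sketch of how such splits could be produced, the hypothetical helper below shuffles a list of IDs and divides it. The 80/20 and 75/25 ratios and the function name are my assumptions, not something the repository prescribes.

```python
import random


def make_splits(image_ids, trainval_frac=0.8, train_frac=0.75, seed=0):
    """Split image IDs into the four Pascal VOC lists: trainval, train, val, test."""
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)  # seeded shuffle so the split is reproducible
    n_trainval = int(len(ids) * trainval_frac)
    trainval, test = ids[:n_trainval], ids[n_trainval:]
    n_train = int(len(trainval) * train_frac)
    train, val = trainval[:n_train], trainval[n_train:]
    return {
        "trainval": sorted(trainval),
        "train": sorted(train),
        "val": sorted(val),
        "test": sorted(test),
    }


if __name__ == "__main__":
    splits = make_splits("%06d" % i for i in range(1, 11))
    for name, ids in splits.items():
        # Each list would be written to ImageSets/Main/<name>.txt, one ID per line
        print(name, len(ids))
```

Because train and val are carved out of trainval, training on trainval uses both of them, and test stays disjoint.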
assert (boxes[:, 2] >= boxes[:, 0]).all()
This error means that a box has xmin greater than xmax or ymin greater than ymax, or that an object has no bndbox element at all. The code responsible is lines 235~241 of lib/datasets/pascal_voc.py:
for ix, obj in enumerate(objs):
    bbox = obj.find('bndbox')
    # Make pixel indexes 0-based
    x1 = float(bbox.find('xmin').text) - 1
    y1 = float(bbox.find('ymin').text) - 1
    x2 = float(bbox.find('xmax').text) - 1
    y2 = float(bbox.find('ymax').text) - 1
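In other words, if your annotations are 0-based, any box whose xmin or ymin is 0 becomes -1 after the subtraction. Stored in an unsigned 16-bit array, that value wraps around to 65535, which is far larger than xmax/ymax.
Other possible causes: bndbox coordinates that exceed the image width or height (an annotation mistake, or images whose orientation was never checked).
For fixes, see these blog posts:
faster rcnn: analysis of assert (boxes[:, 2] >= boxes[:, 0]).all() and understanding the VOC2007 xml coordinate definition
How to fix the assert (boxes[:, 2] >= boxes[:, 0]) error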
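Before training, it can help to scan the annotation files for the problems listed above. The following is a minimal sketch using only the standard library; the function name `find_bad_boxes` and the sample XML are hypothetical, and it assumes every annotation has a `size` element with `width` and `height`.

```python
import xml.etree.ElementTree as ET


def find_bad_boxes(xml_text):
    """Return (object name, reason) pairs for boxes likely to trip the assertion."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width, height = int(size.findtext("width")), int(size.findtext("height"))
    problems = []
    for obj in root.iter("object"):
        name = obj.findtext("name", default="?")
        bbox = obj.find("bndbox")
        if bbox is None:
            problems.append((name, "missing bndbox"))
            continue
        xmin, ymin, xmax, ymax = (
            float(bbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        if xmin < 1 or ymin < 1:
            # The loader subtracts 1, so 0 would become -1 (65535 as uint16)
            problems.append((name, "xmin/ymin below 1"))
        if xmax < xmin or ymax < ymin:
            problems.append((name, "inverted box"))
        if xmax > width or ymax > height:
            problems.append((name, "box exceeds image size"))
    return problems


SAMPLE = (
    "<annotation><size><width>100</width><height>80</height></size>"
    "<object><name>cat</name><bndbox><xmin>0</xmin><ymin>5</ymin>"
    "<xmax>50</xmax><ymax>60</ymax></bndbox></object>"
    "<object><name>dog</name></object>"
    "</annotation>"
)

if __name__ == "__main__":
    for name, reason in find_bad_boxes(SAMPLE):
        print(name, reason)
```

To check a whole dataset, you would read each file in the Annotations folder and pass its contents to the function.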
self._classes = ('__background__', # always index 0
'aeroplane', 'bicycle', 'bird', 'boat',
'bottle', 'bus', 'car', 'cat', 'chair',
'cow', 'diningtable', 'dog', 'horse',
'motorbike', 'person', 'pottedplant',
'sheep', 'sofa', 'train', 'tvmonitor')
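'''Replace the classes above with your own, but leave the __background__ class untouched: it must stay at index 0, and the detector uses it for regions that contain no object'''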
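As an illustration of the rule that background stays at index 0, here is a small sketch that builds a class tuple and its name-to-index map. The helper `build_classes` and the class names `helmet` and `vest` are made up for the example; in the repository you edit the tuple in pascal_voc.py directly.

```python
def build_classes(custom_classes):
    """Return the class tuple and a name-to-index map, with background at index 0."""
    classes = ("__background__",) + tuple(custom_classes)
    class_to_ind = {name: index for index, name in enumerate(classes)}
    return classes, class_to_ind


if __name__ == "__main__":
    classes, class_to_ind = build_classes(["helmet", "vest"])
    print(classes)
    print(class_to_ind)
```

The total number of classes the network predicts is the number of your own classes plus one for the background.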
parser.add_argument('--save_dir', dest='save_dir',
help='directory to save models', default="models",
Open a terminal in the faster-rcnn.pytorch directory and run:
CUDA_VISIBLE_DEVICES=0 python trainval_net.py --dataset pascal_voc --net res101 \
--epochs 20 --bs 1 --nw 4 --lr 1e-2 --lr_decay_step 8 --use_tfb \
--mGPUs --cuda
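Meaning of the parameters: --epochs is the number of training epochs, --bs the batch size, --nw the number of data-loading workers, --lr the initial learning rate, --lr_decay_step the number of epochs between learning-rate decays, --use_tfb turns on TensorBoard logging, --mGPUs enables multi-GPU training, and --cuda trains on the GPU.
Note: each epoch makes one pass over all training images, and the number of iterations per epoch is determined by batch_size and the number of images. For example, with 2000 images and batch_size = 4, iterations = 2000 / 4 = 500.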
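The arithmetic can be written as a one-line helper. This is a sketch that assumes images that do not fill a complete batch are dropped (integer division); the function name is hypothetical.

```python
def iterations_per_epoch(num_images, batch_size):
    """Number of iterations needed for one pass over the training set."""
    # Assumption: a trailing partial batch is dropped, hence floor division
    return num_images // batch_size


if __name__ == "__main__":
    print(iterations_per_epoch(2000, 4))
    print(iterations_per_epoch(2814, 1))
```

The second call corresponds to the training log in this post, which reports 2814 roidb entries and counts iterations as `iter 0/2814` with a batch size of 1.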
When it runs successfully, the terminal prints:
usr@usr:~/faster-rcnn.pytorch$ CUDA_VISIBLE_DEVICES=0,1 python trainval_net.py --dataset pascal_voc --mGPUs --net res101 --bs 1 --nw 4 --cuda
Called with args:
Namespace(batch_size=1, checkepoch=1, checkpoint=0, checkpoint_interval=10000, checksession=1, class_agnostic=False, cuda=True, dataset='pascal_voc', disp_interval=100, imdb_name='voc_2007_trainval', imdbval_name='voc_2007_test', large_scale=False, lr=0.001, lr_decay_gamma=0.1, lr_decay_step=5, mGPUs=True, max_epochs=20, net='res101', num_workers=4, optimizer='sgd', resume=False, save_dir='/home/usrname/faster-rcnn.pytorch/models', session=1, set_cfgs=None, start_epoch=1, use_tfboard=False)
/home/usrname/faster-rcnn.pytorch/lib/model/utils/config.py:374: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
yaml_cfg = edict(yaml.load(f))
Using config:
{'ANCHOR_RATIOS': [0.5, 1, 2],
'ANCHOR_SCALES': [8, 16, 32],
'CROP_RESIZE_WITH_MAX_POOL': False,
'CUDA': False,
'DATA_DIR': '/home/usrname/faster-rcnn.pytorch/data',
'DEDUP_BOXES': 0.0625,
'EPS': 1e-14,
'EXP_DIR': 'res101',
'FEAT_STRIDE': [16],
'GPU_ID': 0,
'MATLAB': 'matlab',
'MAX_NUM_GT_BOXES': 20,
'MOBILENET': {'DEPTH_MULTIPLIER': 1.0,
'FIXED_LAYERS': 5,
'REGU_DEPTH': False,
'WEIGHT_DECAY': 4e-05},
'PIXEL_MEANS': array([[[102.9801, 115.9465, 122.7717]]]),
'POOLING_MODE': 'align',
'POOLING_SIZE': 7,
'RESNET': {'FIXED_BLOCKS': 1, 'MAX_POOL': False},
'RNG_SEED': 3,
'ROOT_DIR': '/home/usrname/faster-rcnn.pytorch',
'TEST': {'BBOX_REG': True,
'HAS_RPN': True,
'MAX_SIZE': 1000,
'MODE': 'nms',
'NMS': 0.3,
'PROPOSAL_METHOD': 'gt',
'RPN_MIN_SIZE': 16,
'RPN_NMS_THRESH': 0.7,
'RPN_POST_NMS_TOP_N': 300,
'RPN_PRE_NMS_TOP_N': 6000,
'RPN_TOP_N': 5000,
'SCALES': [600],
'SVM': False},
'TRAIN': {'ASPECT_GROUPING': False,
'BATCH_SIZE': 128,
'BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
'BBOX_NORMALIZE_MEANS': [0.0, 0.0, 0.0, 0.0],
'BBOX_NORMALIZE_STDS': [0.1, 0.1, 0.2, 0.2],
'BBOX_NORMALIZE_TARGETS': True,
'BBOX_NORMALIZE_TARGETS_PRECOMPUTED': True,
'BBOX_REG': True,
'BBOX_THRESH': 0.5,
'BG_THRESH_HI': 0.5,
'BG_THRESH_LO': 0.0,
'BIAS_DECAY': False,
'BN_TRAIN': False,
'DISPLAY': 20,
'DOUBLE_BIAS': False,
'FG_FRACTION': 0.25,
'FG_THRESH': 0.5,
'GAMMA': 0.1,
'HAS_RPN': True,
'IMS_PER_BATCH': 1,
'LEARNING_RATE': 0.001,
'MAX_SIZE': 1000,
'MOMENTUM': 0.9,
'PROPOSAL_METHOD': 'gt',
'RPN_BATCHSIZE': 256,
'RPN_BBOX_INSIDE_WEIGHTS': [1.0, 1.0, 1.0, 1.0],
'RPN_CLOBBER_POSITIVES': False,
'RPN_FG_FRACTION': 0.5,
'RPN_MIN_SIZE': 8,
'RPN_NEGATIVE_OVERLAP': 0.3,
'RPN_NMS_THRESH': 0.7,
'RPN_POSITIVE_OVERLAP': 0.7,
'RPN_POSITIVE_WEIGHT': -1.0,
'RPN_POST_NMS_TOP_N': 2000,
'RPN_PRE_NMS_TOP_N': 12000,
'SCALES': [600],
'SNAPSHOT_ITERS': 5000,
'SNAPSHOT_KEPT': 3,
'SNAPSHOT_PREFIX': 'res101_faster_rcnn',
'STEPSIZE': [30000],
'SUMMARY_INTERVAL': 180,
'TRIM_HEIGHT': 600,
'TRIM_WIDTH': 600,
'TRUNCATED': False,
'USE_ALL_GT': True,
'USE_FLIPPED': True,
'USE_GT': False,
'WEIGHT_DECAY': 0.0001},
'USE_GPU_NMS': True}
Loaded dataset `voc_2007_trainval` for training
Set proposal method: gt
Appending horizontally-flipped training examples...
wrote gt roidb to /home/usrname/faster-rcnn.pytorch/data/cache/voc_2007_trainval_gt_roidb.pkl
done
Preparing training data...
done
before filtering, there are 2822 images...
after filtering, there are 2814 images...
2814 roidb entries
Loading pretrained weights from data/pretrained_model/resnet101_caffe.pth
/usr/local/lib/python3.5/dist-packages/torch/nn/parallel/_functions.py:61: UserWarning: Was asked to gather along dimension 0, but all input tensors were scalars; will instead unsqueeze and return a vector.
warnings.warn('Was asked to gather along dimension 0, but all '
[session 1][epoch 1][iter 0/2814] loss: 4.8704, lr: 1.00e-03
fg/bg=(2/126), time cost: 0.592122
rpn_cls: 0.6998, rpn_box: 1.3103, rcnn_cls: 2.8602, rcnn_box 0.0001
[session 1][epoch 1][iter 100/2814] loss: 2.1768, lr: 1.00e-03
fg/bg=(23/105), time cost: 12.911908
rpn_cls: 0.3399, rpn_box: 0.7910, rcnn_cls: 0.9739, rcnn_box 0.0449
...
Edit lines 168~173 of demo.py to list your own classes, in the same order as the classes in pascal_voc.py above:
pascal_classes = np.asarray(['__background__',
'aeroplane', 'bicycle', 'bird', 'boat',
'bottle', 'bus', 'car', 'cat', 'chair',
'cow', 'diningtable', 'dog', 'horse',
'motorbike', 'person', 'pottedplant',
'sheep', 'sofa', 'train', 'tvmonitor'])
python demo.py --dataset pascal_voc --net res101 --cfg cfgs/res101.yml --load_dir models --checksession 1 --checkepoch 28 --checkpoint 2813 --image_dir images --cuda --bs 1
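Meaning of the parameters: --load_dir is the directory the models were saved to; --checksession, --checkepoch and --checkpoint select which saved model to load; --image_dir is the folder of images to run detection on; --cuda runs on the GPU.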
fasterRCNN.load_state_dict(checkpoint['model'])
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for resnet:
size mismatch for RCNN_cls_score.weight: copying a param with shape torch.Size([16, 2048]) from checkpoint, the shape in current model is torch.Size([21, 2048]).
size mismatch for RCNN_cls_score.bias: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([21]).
size mismatch for RCNN_bbox_pred.weight: copying a param with shape torch.Size([64, 2048]) from checkpoint, the shape in current model is torch.Size([84, 2048]).
size mismatch for RCNN_bbox_pred.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([84]).
***Cause:*** the classes in demo.py were not changed to your own classes before running detection.
***Solution:*** change lines 168~173 of demo.py, as described above.
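The shapes in the traceback can be read back into class counts. As a sketch, assuming the classification layer has one output per class (background included) and the box-regression layer has four outputs per class unless class-agnostic regression is used, a hypothetical helper looks like this:

```python
def head_output_sizes(num_classes, class_agnostic=False):
    """Output sizes of the classification and box-regression layers."""
    cls_out = num_classes  # one score per class, background included
    # Four box coordinates per class, or just four when class-agnostic
    bbox_out = 4 if class_agnostic else 4 * num_classes
    return cls_out, bbox_out


if __name__ == "__main__":
    print(head_output_sizes(16))  # sizes stored in the checkpoint
    print(head_output_sizes(21))  # sizes built from the default class list
```

Read this way, the checkpoint in the traceback was trained with 16 classes (15 custom classes plus background), while the model built by demo.py still had the 21 default Pascal VOC classes.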
FileNotFoundError: [Errno 2] No such file or directory: 'models/res101/pascal_voc/faster_rcnn_1_28_10021.pth'
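***Cause:*** no model file matches the four parameters --load_dir models --checksession 1 --checkepoch 28 --checkpoint 10021.
***Solution:*** check the file name of your trained model and pass the matching values for these four parameters.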
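The path in the error message shows how the parameters map onto the file name. The sketch below reconstructs that pattern and also parses a file name back into its three numbers; both function names are hypothetical, and the pattern is inferred from the error message rather than taken from the repository code.

```python
import os
import re


def checkpoint_path(load_dir, net, dataset, session, epoch, checkpoint):
    """Build <load_dir>/<net>/<dataset>/faster_rcnn_<session>_<epoch>_<checkpoint>.pth."""
    name = "faster_rcnn_{}_{}_{}.pth".format(session, epoch, checkpoint)
    return os.path.join(load_dir, net, dataset, name)


def parse_checkpoint_name(filename):
    """Recover (session, epoch, checkpoint) from a saved model's file name."""
    base = os.path.basename(filename)
    match = re.fullmatch(r"faster_rcnn_(\d+)_(\d+)_(\d+)\.pth", base)
    if match is None:
        raise ValueError("unexpected checkpoint name: " + base)
    return tuple(int(group) for group in match.groups())


if __name__ == "__main__":
    path = checkpoint_path("models", "res101", "pascal_voc", 1, 28, 2813)
    print(path)
    print(parse_checkpoint_name(path))
```

Running `parse_checkpoint_name` on a file in your models folder gives the values to pass to --checksession, --checkepoch and --checkpoint.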
fasterRCNN.load_state_dict(checkpoint['model'])
File "/usr/local/lib/python3.5/dist-packages/torch/nn/modules/module.py", line 769, in load_state_dict
self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for resnet:
size mismatch for RCNN_rpn.RPN_cls_score.weight: copying a param with shape torch.Size([18, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([24, 512, 1, 1]).
size mismatch for RCNN_rpn.RPN_cls_score.bias: copying a param with shape torch.Size([18]) from checkpoint, the shape in current model is torch.Size([24]).
size mismatch for RCNN_rpn.RPN_bbox_pred.weight: copying a param with shape torch.Size([36, 512, 1, 1]) from checkpoint, the shape in current model is torch.Size([48, 512, 1, 1]).
size mismatch for RCNN_rpn.RPN_bbox_pred.bias: copying a param with shape torch.Size([36]) from checkpoint, the shape in current model is torch.Size([48]).
size mismatch for RCNN_cls_score.weight: copying a param with shape torch.Size([16, 2048]) from checkpoint, the shape in current model is torch.Size([21, 2048]).
size mismatch for RCNN_cls_score.bias: copying a param with shape torch.Size([16]) from checkpoint, the shape in current model is torch.Size([21]).
size mismatch for RCNN_bbox_pred.weight: copying a param with shape torch.Size([64, 2048]) from checkpoint, the shape in current model is torch.Size([84, 2048]).
size mismatch for RCNN_bbox_pred.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([84]).
***Cause:*** this error comes from lines 291~292 of config.py in the ./faster-rcnn.pytorch/lib/model/utils/ directory:
# Anchor scales for RPN
__C.ANCHOR_SCALES = [8,16,32]
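The root cause is that the anchor_scales used by the demo do not match the anchor_scales the model was trained with: the demo uses [8,16,32] for every dataset, whereas a model trained on the COCO dataset uses [4,8,16,32], so the RPN layer shapes no longer agree.
***Solution:*** change lines 291~292 of config.py in the ./faster-rcnn.pytorch/lib/model/utils/ directory to: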
# Anchor scales for RPN
__C.ANCHOR_SCALES = [4,8,16,32]
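The RPN shapes in the traceback follow directly from the number of anchors. As a sketch, assuming one anchor per (scale, ratio) pair, two class scores per anchor and four box offsets per anchor, with the three ratios [0.5, 1, 2] shown in the config dump earlier in this post (the function name is hypothetical):

```python
def rpn_output_channels(anchor_scales, anchor_ratios=(0.5, 1, 2)):
    """Channel counts of the RPN score and box-offset layers."""
    num_anchors = len(anchor_scales) * len(anchor_ratios)
    # Two scores (object / not object) and four offsets per anchor
    return 2 * num_anchors, 4 * num_anchors


if __name__ == "__main__":
    print(rpn_output_channels([8, 16, 32]))
    print(rpn_output_channels([4, 8, 16, 32]))
```

Three scales give 18 and 36 channels and four scales give 24 and 48, so the numbers in a size-mismatch message tell you how many scales the checkpoint and the current model each use. Whatever value you set, it has to equal the one used for training.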
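Finally, here is a blog post that explains the parameters in config.py; it is a useful reference if you need to tune the network for your own dataset:
faster-rcnn (pytorch) parameter configuration changes
That wraps up this Faster R-CNN walkthrough. I wrote this post as a record and summary to look back on later. If anything is not covered, or you find a problem in the article, feel free to leave a comment.
Writing takes effort; please credit the source when reposting!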