Today we'll walk through how to use mmdetection to train on your own dataset.
Environment:
centos=7.9.2009
python=3.7.0
cuda=10.2.89
cudnn=7.6.5
torch=1.6.0
torchvision=0.7.0
The data format is the same VOC-style layout used for yolov3.
The directory structure is as follows:
mmdetection
----data
--------VOCdevkit
------------VOC2007
----------------Annotations
----------------ImageSets
--------------------Main
------------------------train.txt/test.txt/…
----------------JPEGImages
For details, refer to get_started.
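If you only have the images and XML annotations, the split files under ImageSets/Main can be generated with a short script. A minimal sketch, assuming the VOC2007 layout above and an arbitrary 9:1 trainval/test split (the paths, ratio and seed are just examples):
import os
import random

root = 'data/VOCdevkit/VOC2007'  # adjust to your own path
# one id per annotation file
ids = [f[:-4] for f in os.listdir(os.path.join(root, 'Annotations'))
       if f.endswith('.xml')]
random.seed(0)
random.shuffle(ids)

split = int(len(ids) * 0.9)  # 90% trainval, 10% test
sets = {'trainval.txt': ids[:split], 'test.txt': ids[split:]}
os.makedirs(os.path.join(root, 'ImageSets/Main'), exist_ok=True)
for name, subset in sets.items():
    with open(os.path.join(root, 'ImageSets/Main', name), 'w') as f:
        f.write('\n'.join(subset) + '\n')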
1. First, make sure GCC > 5:
sudo yum install centos-release-scl
sudo yum install devtoolset-8-gcc*
scl enable devtoolset-8 bash
gcc -v  # check the version; scl enable only takes effect for the current session
2. Install MMCV:
# the simple option:
pip install mmcv
# alternatively, install mmcv-full (with CUDA ops) matching your CUDA/torch versions: pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.6.0/index.html
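To confirm that mmcv-full was built against the right CUDA/PyTorch combination, you can run the verification snippet from the MMCV docs (it needs mmcv-full, not the lite mmcv):
import mmcv
from mmcv.ops import get_compiling_cuda_version, get_compiler_version

print(mmcv.__version__)              # installed mmcv-full version
print(get_compiling_cuda_version())  # CUDA version the ops were compiled with
print(get_compiler_version())        # compiler used for the CUDA ops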
3. Clone the source and build the environment:
git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
pip install -r requirements/build.txt
pip install -v -e .  # or "python setup.py develop"; note the trailing dot
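A quick sanity check that the editable install worked (the printed versions depend on the commit you cloned):
import torch
import mmdet

print(torch.__version__, torch.cuda.is_available())  # expect 1.6.0 and True
print(mmdet.__version__)                             # the mmdetection version just installed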
1. Modify mmdetection/configs/_base_/datasets/voc0712.py:
# dataset settings
dataset_type = 'VOCDataset'
data_root = 'data/VOCdevkit/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1000, 600), keep_ratio=True),  # change the training image scale here
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1000, 600),  # change the test image scale here
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=2,  # batch size per GPU; change it here
    workers_per_gpu=2,
    train=dict(
        type='RepeatDataset',
        times=3,
        dataset=dict(
            type=dataset_type,
            ann_file=[
                data_root + 'VOC2007/ImageSets/Main/trainval.txt',
                data_root + 'VOC2007/ImageSets/Main/trainval.txt'
            ],  # changed from VOC2012 to VOC2007 here
            img_prefix=[data_root + 'VOC2007/', data_root + 'VOC2007/'],  # changed from VOC2012 to VOC2007 here
            pipeline=train_pipeline)),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline))
evaluation = dict(interval=1, metric='mAP')
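Before training, it is worth checking that the modified dataset config resolves to the paths you expect. A minimal sketch using mmcv's Config loader, run from the mmdetection root:
from mmcv import Config

cfg = Config.fromfile('configs/_base_/datasets/voc0712.py')
print(cfg.data.train.dataset.ann_file)  # should list your VOC2007 trainval.txt
print(cfg.data.val.ann_file)            # should point at VOC2007 test.txt
print(cfg.data.samples_per_gpu)         # the batch size set above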
2. Modify mmdetection/mmdet/datasets/voc.py:
class VOCDataset(XMLDataset):

    # replace with your own classes
    CLASSES = ('aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car',
               'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse',
               'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train',
               'tvmonitor')
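For example, with a hypothetical four-class dataset (matching num_classes=4 in step 5 below), the tuple would simply become:
class VOCDataset(XMLDataset):

    # hypothetical example: a four-class custom dataset
    CLASSES = ('person', 'car', 'dog', 'cat')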
3. Modify mmdetection/mmdet/core/evaluation/class_names.py:
def voc_classes():
    # replace with your own classes
    return [
        'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
        'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
        'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor'
    ]
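The list here must stay consistent with the CLASSES tuple from step 2; with the same hypothetical four classes:
def voc_classes():
    # must match VOCDataset.CLASSES in mmdet/datasets/voc.py
    return ['person', 'car', 'dog', 'cat']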
4. Modify mmdetection/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py:
After the change:
_base_ = [
    '../_base_/models/faster_rcnn_r50_fpn.py',  # model config file
    # '../_base_/datasets/coco_detection.py',  # replaced with voc0712.py
    '../_base_/datasets/voc0712.py',  # dataset config file
    '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'  # learning rate, number of epochs, checkpoint loading, etc.
]
5. Modify mmdetection/configs/_base_/models/faster_rcnn_r50_fpn.py:
bbox_head=dict(
    type='Shared2FCBBoxHead',
    in_channels=256,
    fc_out_channels=1024,
    roi_feat_size=7,
    num_classes=4,  # change to the number of your own classes (background is not counted in mmdet 2.x)
    bbox_coder=dict(
        type='DeltaXYWHBBoxCoder',
        target_means=[0., 0., 0., 0.],
        target_stds=[0.1, 0.1, 0.2, 0.2]),
    reg_class_agnostic=False,
    loss_cls=dict(
        type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
    loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
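As an alternative to editing the base model file in place, mmdetection's config inheritance also lets you override num_classes from the top-level config; a minimal sketch with a hypothetical file name, assuming the same four-class example:
# e.g. configs/faster_rcnn/faster_rcnn_r50_fpn_1x_voc.py (hypothetical file name)
_base_ = [
    '../_base_/models/faster_rcnn_r50_fpn.py',
    '../_base_/datasets/voc0712.py',
    '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]
# override only the field that differs from the base model config
model = dict(roi_head=dict(bbox_head=dict(num_classes=4)))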
6. Modify mmdetection/mmdet/datasets/xml_style.py (only needed if some images are not jpg), around line 37:
data_infos = []
img_ids = mmcv.list_from_file(ann_file)
# collect the ids of all jpg images (requires `import os` at the top of xml_style.py)
jpgdir = filter(lambda x: x.endswith("jpg"),
                os.listdir(r"mmdetection/data/VOCdevkit/VOC2007/JPEGImages/"))
jpgdir = [img[:-4] for img in jpgdir]
for img_id in img_ids:
    if img_id in jpgdir:
        filename = f'JPEGImages/{img_id}.jpg'
    else:
        filename = f'JPEGImages/{img_id}.png'
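A related sanity check before training is to make sure every id listed in the split file actually has an image (jpg or png) and an XML annotation. A minimal sketch, assuming the directory layout from the beginning of this post:
import os

root = 'data/VOCdevkit/VOC2007'  # adjust to your own path
ids = open(os.path.join(root, 'ImageSets/Main/trainval.txt')).read().split()
for img_id in ids:
    has_img = any(
        os.path.exists(os.path.join(root, 'JPEGImages', img_id + ext))
        for ext in ('.jpg', '.png'))
    has_xml = os.path.exists(os.path.join(root, 'Annotations', img_id + '.xml'))
    if not (has_img and has_xml):
        print('missing files for', img_id)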
1. Train the model:
python ./tools/train.py ./configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py
The trained checkpoints are saved under mmdetection/work_dirs.
2. Test the model:
①. Command (image_demo.py takes the test image, the config file and the trained checkpoint as arguments):
cd demo/
python image_demo.py ${IMAGE_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE}
②. Saving the test result image (modify mmdetection/mmdet/apis/inference.py):
    if hasattr(model, 'module'):
        model = model.module
    img = model.show_result(img, result, score_thr=score_thr, show=False)
    # plt.figure(figsize=fig_size)
    # save the drawn result with PIL instead of showing it with matplotlib
    img = mmcv.bgr2rgb(img)
    from PIL import Image
    im = Image.fromarray(img)
    im.save("your_file.jpeg")
    # plt.imshow(mmcv.bgr2rgb(img))
    # plt.title(title)
    # plt.tight_layout()
    # plt.show(block=block)
The resulting image is saved under demo/.
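Instead of patching inference.py, the same effect can usually be achieved with the high-level API, since model.show_result accepts an out_file argument in mmdetection 2.x. A minimal sketch (config, checkpoint and image paths below are placeholders):
from mmdet.apis import init_detector, inference_detector

config = 'configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
checkpoint = 'work_dirs/faster_rcnn_r50_fpn_1x_coco/latest.pth'  # your trained checkpoint
model = init_detector(config, checkpoint, device='cuda:0')

result = inference_detector(model, 'demo/demo.jpg')
# draw boxes above the score threshold and write the image to disk
model.show_result('demo/demo.jpg', result, score_thr=0.3, out_file='demo/result.jpg')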