Training MMDetection on your own dataset (detailed)

Today we will walk through how to train MMDetection on your own dataset.
Environment used:

centos=7.9.2009
python=3.7.0
cuda=10.2.89
cudnn=7.6.5
torch=1.6.0
torchvision=0.7.0

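1. Preparing the dataset

The data format follows the VOC-style layout that is also used for yolov3.
The directory layout is as follows: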

mmdetection
----data
--------VOCdevkit
------------VOC2007
----------------Annotations
----------------ImageSets
--------------------Main
------------------------train.txt/test.txt/…
----------------JPEGImages
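If the split files under ImageSets/Main do not exist yet, they can be generated from the annotation file names. The following is only a minimal sketch: the function name `write_splits`, the 80/20 ratio, and the fixed seed are assumptions for illustration, not part of MMDetection. It writes trainval.txt and test.txt, which are the two files the dataset config in this post reads.

```python
import random
from pathlib import Path


def write_splits(voc_root, test_ratio=0.2, seed=0):
    """Write trainval.txt and test.txt under ImageSets/Main (hypothetical helper).

    Image ids are the annotation file names without the .xml suffix.
    """
    voc_root = Path(voc_root)
    ids = sorted(p.stem for p in (voc_root / "Annotations").glob("*.xml"))
    random.Random(seed).shuffle(ids)  # fixed seed keeps the split reproducible

    n_test = int(len(ids) * test_ratio)
    splits = {"test": ids[:n_test], "trainval": ids[n_test:]}

    out_dir = voc_root / "ImageSets" / "Main"
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, split_ids in splits.items():
        # One image id per line, without a file extension
        text = "".join(f"{img_id}\n" for img_id in sorted(split_ids))
        (out_dir / f"{name}.txt").write_text(text)
    return {name: len(split_ids) for name, split_ids in splits.items()}
```

Called as `write_splits("data/VOCdevkit/VOC2007")` from the mmdetection root, it returns the number of ids written to each split.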

2. Environment setup and source download

See the MMDetection get_started guide for details.
    1. First, GCC > 5 is needed:

sudo yum install centos-release-scl
sudo yum install devtoolset-8-gcc*
scl enable devtoolset-8 bash
gcc -v  # check the version; the scl switch only applies to the current session

    2. Install MMCV:

# The simple option is
pip install mmcv
# Alternatively, install mmcv-full (pick one of the two, not both): pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.6.0/index.html
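The {cu_version} and {torch_version} placeholders are filled in from the installed CUDA and torch versions. As a small sketch of that substitution (the helper name `mmcv_index_url` is made up here, and it assumes a CUDA build, where only the major and minor version numbers are used):

```python
def mmcv_index_url(cuda_version, torch_version):
    """Fill the mmcv-full index URL template (hypothetical helper).

    cuda_version: e.g. "10.2.89"; only major.minor is used -> "cu102"
    torch_version: e.g. "1.6.0" -> "torch1.6.0"
    """
    major, minor = cuda_version.split(".")[:2]
    return (
        "https://download.openmmlab.com/mmcv/dist/"
        f"cu{major}{minor}/torch{torch_version}/index.html"
    )


# Versions from the environment listed at the top of this post
print(mmcv_index_url("10.2.89", "1.6.0"))
```

For the environment in this post, this reproduces the URL used in the pip command above.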

    3. Download the source and build it:

git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
pip install -r requirements/build.txt
pip install -v -e .  # or "python setup.py develop"; note the trailing dot

3. Modifying the parameters

    1. Edit mmdetection/configs/_base_/datasets/voc0712.py:

# dataset settings
dataset_type = 'VOCDataset'
data_root = 'data/VOCdevkit/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1000, 600), keep_ratio=True),  # change the training image size here
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1000, 600),  # change the test image size here
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=2,  # batch size per GPU; change as needed
    workers_per_gpu=2,
    train=dict(
        type='RepeatDataset',
        times=3,
        dataset=dict(
            type=dataset_type,
            ann_file=[
                data_root + 'VOC2007/ImageSets/Main/trainval.txt',
                data_root + 'VOC2007/ImageSets/Main/trainval.txt'
            ],  # changed to VOC2007 here
            img_prefix=[data_root + 'VOC2007/', data_root + 'VOC2007/'],  # changed to VOC2007 here
            pipeline=train_pipeline)),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline))
evaluation = dict(interval=1, metric='mAP')
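Before training, it can be worth checking that every id listed in the split files referenced by `ann_file` actually has an annotation and an image on disk. A rough sketch under the directory layout described earlier (the function name `find_missing` and the extension list are assumptions, not MMDetection APIs):

```python
from pathlib import Path


def find_missing(voc_root, split="trainval", image_exts=(".jpg", ".png")):
    """Return ids from a split file that lack an annotation or an image."""
    voc_root = Path(voc_root)
    split_file = voc_root / "ImageSets" / "Main" / f"{split}.txt"
    missing = []
    for img_id in split_file.read_text().split():
        has_xml = (voc_root / "Annotations" / f"{img_id}.xml").is_file()
        has_img = any(
            (voc_root / "JPEGImages" / f"{img_id}{ext}").is_file()
            for ext in image_exts
        )
        if not (has_xml and has_img):
            missing.append(img_id)
    return missing
```

An empty list from `find_missing("data/VOCdevkit/VOC2007", "trainval")` and from the same call with `"test"` means the two split files used in the config are consistent with the files on disk.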

    2. Edit mmdetection/mmdet/datasets/voc.py:

class VOCDataset(XMLDataset):
    # Replace with your own classes
    CLASSES = ('aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car',
               'cat', 'chair', 'cow', 'diningtable', 'dog', 'horse',
               'motorbike', 'person', 'pottedplant', 'sheep', 'sofa', 'train',
               'tvmonitor') 

    3. Edit mmdetection/mmdet/core/evaluation/class_names.py:

def voc_classes():
    # Replace with your own classes
    return [
        'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
        'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
        'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor'
    ]
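To find out which class names to put into `CLASSES` and `voc_classes()`, the labels can be read straight from the annotation files. This is a sketch that assumes standard VOC XML, where each `<object>` element has a `<name>` child; the helper name `count_classes` is made up for this post.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from pathlib import Path


def count_classes(ann_dir):
    """Count how often each <object>/<name> label occurs in VOC XML files."""
    counts = Counter()
    for xml_path in sorted(Path(ann_dir).glob("*.xml")):
        root = ET.parse(xml_path).getroot()
        for obj in root.iter("object"):
            name = obj.findtext("name")
            if name:
                counts[name.strip()] += 1
    return counts
```

`tuple(sorted(count_classes("data/VOCdevkit/VOC2007/Annotations")))` then gives a tuple that can be pasted into both files. With a single class, keep the trailing comma, e.g. `CLASSES = ('cat',)`, because `('cat')` is just a string in Python.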

    4. Edit mmdetection/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py
After the change (the original dataset entry is commented out):

_base_ = [
    '../_base_/models/faster_rcnn_r50_fpn.py',  # model config file
    # '../_base_/datasets/coco_detection.py',  # original entry, replaced by voc0712.py
    '../_base_/datasets/voc0712.py',  # dataset config file (voc0712.py)
    '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'  # learning rate, number of epochs, checkpoint loading path, etc.
]

    5. Edit mmdetection/configs/_base_/models/faster_rcnn_r50_fpn.py:

        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=4,  # set to the number of your own classes
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0., 0., 0., 0.],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='L1Loss', loss_weigh

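    6. (Only if some images are not in jpg format) edit load_annotations() in mmdetection/mmdet/datasets/xml_style.py:

        # Needs "import os" at the top of xml_style.py
        data_infos = []
        img_ids = mmcv.list_from_file(ann_file)
        # Ids of all .jpg images; every other id is assumed to be a .png.
        # The path is relative to the working directory, adjust it if needed.
        jpg_dir = r"mmdetection/data/VOCdevkit/VOC2007/JPEGImages/"
        jpg_ids = {f[:-4] for f in os.listdir(jpg_dir) if f.endswith(".jpg")}
        for img_id in img_ids:
            if img_id in jpg_ids:
                filename = f'JPEGImages/{img_id}.jpg'
            else:
                filename = f'JPEGImages/{img_id}.png'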
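The same extension lookup can be tried out on its own before patching the library. The sketch below is a standalone variant, not the code in xml_style.py: the name `resolve_filename` is hypothetical, and unlike the patch above it raises an error instead of silently falling back to .png when no image exists.

```python
from pathlib import Path


def resolve_filename(img_dir, img_id, exts=(".jpg", ".png")):
    """Return 'JPEGImages/<id><ext>' for the first extension found on disk."""
    for ext in exts:
        if (Path(img_dir) / f"{img_id}{ext}").is_file():
            return f"JPEGImages/{img_id}{ext}"
    raise FileNotFoundError(f"no image for id {img_id!r} in {img_dir}")
```

Checking each id directly avoids listing the whole directory, and adding another format only means extending `exts`.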

4. Training and testing

    1. Train the model:

python ./tools/train.py ./configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py

The trained models are saved under mmdetection/work_dirs.
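To pick a checkpoint for testing, the newest one can be looked up programmatically. This sketch assumes the checkpoints are named epoch_<N>.pth and sit in a per-config subdirectory of work_dirs; both the naming and the helper name `latest_epoch_checkpoint` are assumptions to verify against your own work_dirs contents.

```python
import re
from pathlib import Path


def latest_epoch_checkpoint(work_dir):
    """Return the epoch_<N>.pth file with the largest N, or None if absent."""
    best, best_epoch = None, -1
    for path in Path(work_dir).glob("epoch_*.pth"):
        match = re.fullmatch(r"epoch_(\d+)\.pth", path.name)
        # Compare numerically so that epoch_10 ranks above epoch_2
        if match and int(match.group(1)) > best_epoch:
            best, best_epoch = path, int(match.group(1))
    return best
```

For example, `latest_epoch_checkpoint("work_dirs/faster_rcnn_r50_fpn_1x_coco")` would return the path of the highest-numbered epoch file in that directory.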
    2. Test the model:
①. Command:

cd demo/
python image_demo.py ${IMAGE_FILE} ${CONFIG_FILE} ${CHECKPOINT_FILE}  # image, config, and checkpoint paths

②. Saving the test result image (edit mmdetection/mmdet/apis/inference.py):

    if hasattr(model, 'module'):
        model = model.module
    img = model.show_result(img, result, score_thr=score_thr, show=False)
    # plt.figure(figsize=fig_size)
    img=mmcv.bgr2rgb(img)
    from PIL import Image
    im = Image.fromarray(img)
    im.save("your_file.jpeg")
    # plt.imshow(mmcv.bgr2rgb(img))
    # plt.title(title)
    # plt.tight_layout()
    # plt.show(block=block)

The resulting image (your_file.jpeg) is saved in mmdetection/demo/.
