(Detailed tutorial) Training, testing, and evaluating your own model with mmdetection

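1. Introduction

SenseTime (winner of the 2018 COCO object detection challenge) and the Chinese University of Hong Kong recently open-sourced mmdetection, a PyTorch-based deep learning toolbox for object detection. It supports mainstream detection frameworks such as Faster-RCNN, Mask-RCNN, Fast-RCNN, and Cascade-RCNN, and lets you get your own model up and running quickly.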

2. Requirements

  1. Linux (Windows is not supported)
  2. Python 3.5+
  3. PyTorch >= 1.1
  4. CUDA >= 9.0
  5. NCCL 2
  6. GCC >= 4.9
  7. mmcv
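
The Python and PyTorch requirements above can be checked from Python itself. The sketch below is only an illustration: the helper names parse_version and meets_minimum are made up for this example, and it assumes version strings shaped like 1.4.0 or 1.4.0+cu101.

```python
import re
import sys


def parse_version(version):
    """Turn a version string such as '1.4.0+cu101' into a tuple of ints, e.g. (1, 4, 0)."""
    numbers = []
    for part in version.split('+')[0].split('.'):
        match = re.match(r'\d+', part)
        if match is None:
            break  # stop at the first non-numeric component, e.g. 'post2'
        numbers.append(int(match.group()))
    return tuple(numbers)


def meets_minimum(version, minimum):
    """Return True if `version` (a string) is at least `minimum` (a tuple of ints)."""
    return parse_version(version) >= tuple(minimum)


if __name__ == '__main__':
    print('Python >= 3.5:', sys.version_info >= (3, 5))
    try:
        import torch  # only importable once PyTorch has been installed
    except ImportError:
        print('PyTorch is not installed yet')
    else:
        print('PyTorch >= 1.1:', meets_minimum(torch.__version__, (1, 1)))
        # torch.version.cuda is None for CPU-only builds
        print('CUDA build:', torch.version.cuda)
```

Comparing tuples of integers rather than raw strings avoids the trap where '1.10' sorts before '1.2'.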

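3. Installation

As the official documentation recommends, install Anaconda first, create a Python virtual environment, and install the packages with conda.

  1. Create a virtual environment with conda
conda create -n open-mmlab python=3.7 -y  # create an environment named open-mmlab with Python 3.7
conda activate open-mmlab  # activate the environment
  1. Install PyTorch and torchvision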
conda install pytorch torchvision -c pytorch

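conda lists the packages it is about to install. If you do not pin any versions, it installs the latest release, which at the time of writing is PyTorch 1.4 built for CUDA 10.1. You can also pin PyTorch and the matching CUDA version, for example PyTorch 1.1 with CUDA 9.0: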

conda install pytorch=1.1 torchvision cudatoolkit=9.0 -c pytorch
  1. Download mmdetection
git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
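  1. Install mmdetection
pip install mmcv cython   # install mmcv and cython
pip install "albumentations>=0.3.2" imagecorruptions pycocotools six terminaltables  # install the dependencies; the quotes stop the shell from treating > as a redirect
python setup.py develop  # build the mmdetection library; this takes a while

If pip is slow, append -i https://pypi.tuna.tsinghua.edu.cn/simple to use a mirror inside China. For example, to install mmcv from the Tsinghua mirror: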

pip install mmcv -i https://pypi.tuna.tsinghua.edu.cn/simple

mmcv must be installed before you run setup.py; otherwise the build fails with an error.

  1. Verify the installation
    Download a pretrained faster_rcnn_r50_fpn_1x model, save it under mmdetection/checkpoints, and run the code below. If the image is displayed, the installation succeeded.
from mmdet.apis import init_detector, inference_detector, show_result
import mmcv

config_file = 'configs/faster_rcnn_r50_fpn_1x.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth'

# build the model from a config file and a checkpoint file
model = init_detector(config_file, checkpoint_file, device='cuda:0')

# test a single image and show the results
img = 'test.jpg'  # or img = mmcv.imread(img), which will only load it once
result = inference_detector(model, img)
# visualize the results in a new window
show_result(img, result, model.CLASSES)
# or save the visualization results to image files
show_result(img, result, model.CLASSES, out_file='result.jpg')

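4. Creating a COCO-format dataset

mmdetection supports the COCO, VOC, and Cityscapes datasets, but most of the pretrained models and config files were trained on COCO, so I recommend building a COCO-format dataset. The COCO directory layout is: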
mmdetection
├── data
│ ├── coco
│ │ ├── annotations
│ │ │ ├── instances_train2017.json
│ │ │ ├── instances_val2017.json
│ │ ├── train2017
│ │ │ ├── image1.png
│ │ │ ├── image2.png
│ │ │ ├── …
│ │ ├── val2017
│ │ │ ├── image1.png
│ │ │ ├── image2.png
│ │ │ ├── …
Note: testing uses the validation data.
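
Before training, it can help to confirm that the layout above is actually in place. The following is a minimal sketch, not part of mmdetection: the function name missing_coco_paths is hypothetical, and it assumes the script is run from the mmdetection checkout.

```python
from pathlib import Path

# Paths taken from the directory tree above, relative to the mmdetection root
EXPECTED_PATHS = [
    'data/coco/annotations/instances_train2017.json',
    'data/coco/annotations/instances_val2017.json',
    'data/coco/train2017',
    'data/coco/val2017',
]


def missing_coco_paths(mmdet_root):
    """Return the expected files and folders that do not exist under mmdet_root."""
    root = Path(mmdet_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == '__main__':
    # Assumption: the current directory is the mmdetection checkout
    missing = missing_coco_paths('.')
    if missing:
        print('Missing:', ', '.join(missing))
    else:
        print('Dataset layout looks complete')
```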
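Image annotation files come in many formats, and CSV works as an intermediate format that can be converted to any of them. I recommend converting your own data to COCO format with the conversion code used here, which supports: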

  • csv to coco
  • csv to voc
  • labelme to coco
  • labelme to voc
  • csv to json
    Below is the code that converts the XML labels produced by the labelImg annotation tool to CSV, and then the CSV to COCO format.
  1. Convert labelImg files to CSV
import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET

def xml_to_csv(path):
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[4][0].text),
                     int(member[4][1].text),
                     int(member[4][2].text),
                     int(member[4][3].text)
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df
    
def main():
    image_path = '/media/labels_xml'  # folder containing the XML files annotated with labelImg
    csv_path = 'images_label.csv'     # CSV file to generate
    xml_df = xml_to_csv(image_path)
    xml_df.to_csv(csv_path, index=False)
    print('Successfully converted xml to csv.')
main()
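
The script above reads fields by position (member[0], member[4][0] and so on), which depends on the element order labelImg happens to write. As a hedged alternative, the sketch below looks the same fields up by tag name. The function name parse_labelimg_xml and the sample annotation are made up for illustration; the tag names follow the Pascal VOC layout that labelImg produces.

```python
import xml.etree.ElementTree as ET

# A made-up annotation in the Pascal VOC layout written by labelImg
SAMPLE_XML = """
<annotation>
  <filename>image1.png</filename>
  <size><width>213</width><height>120</height><depth>3</depth></size>
  <object>
    <name>human</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>60</xmax><ymax>100</ymax></bndbox>
  </object>
</annotation>
"""


def parse_labelimg_xml(xml_text):
    """Return one (filename, width, height, class, xmin, ymin, xmax, ymax) row per object."""
    root = ET.fromstring(xml_text.strip())
    filename = root.findtext('filename')
    width = int(root.findtext('size/width'))
    height = int(root.findtext('size/height'))
    rows = []
    for obj in root.findall('object'):
        box = obj.find('bndbox')
        coords = [int(box.findtext(tag)) for tag in ('xmin', 'ymin', 'xmax', 'ymax')]
        rows.append((filename, width, height, obj.findtext('name'), *coords))
    return rows


print(parse_labelimg_xml(SAMPLE_XML))
# [('image1.png', 213, 120, 'human', 10, 20, 60, 100)]
```

The rows use the same column order as the CSV written above, so they could be passed to pd.DataFrame with the same column_name list.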
  1. Convert the CSV file to COCO format
# -*- coding: utf-8 -*-
import os
import json
import shutil
import time

import numpy as np
import pandas as pd
import cv2
from sklearn.model_selection import train_test_split
from tqdm import tqdm

np.random.seed(41)
# ID 0 is reserved for the background
classname_to_id = {"human": 1}  # change this to your own classes and their IDs

class Csv2CoCo:

    def __init__(self,image_dir,total_annos):
        self.images = []
        self.annotations = []
        self.categories = []
        self.img_id = 0
        self.ann_id = 0
        self.image_dir = image_dir
        self.total_annos = total_annos

    def save_coco_json(self, instance, save_path):
        # indent=2 makes the JSON easier to read
        with open(save_path, 'w', encoding='utf-8') as f:
            json.dump(instance, f, ensure_ascii=False, indent=2)
    # Build the COCO dict from the CSV annotations
    def to_coco(self, keys):
        self._init_categories()
        for key in keys:
            self.images.append(self._image(key))
            shapes = self.total_annos[key]
            for shape in shapes:
                bboxi = []
                for cor in shape[:-1]:
                    bboxi.append(int(cor))
                label = shape[-1]
                annotation = self._annotation(bboxi,label)
                self.annotations.append(annotation)
                self.ann_id += 1
            self.img_id += 1
        instance = {}
        instance['info'] = 'spytensor created'
        instance['license'] = ['license']
        instance['images'] = self.images
        instance['annotations'] = self.annotations
        instance['categories'] = self.categories
        return instance

    # Build the categories
    def _init_categories(self):
        for k, v in classname_to_id.items():
            category = {}
            category['id'] = v
            category['name'] = k
            self.categories.append(category)

    # Build the COCO image field
    def _image(self, path):
        image = {}
        img = cv2.imread(self.image_dir + path)
        image['height'] = img.shape[0]
        image['width'] = img.shape[1]
        image['id'] = self.img_id
        image['file_name'] = path
        return image

    # Build the COCO annotation field
    def _annotation(self, shape,label):
        # label = shape[-1]
        points = shape[:4]
        annotation = {}
        annotation['id'] = self.ann_id
        annotation['image_id'] = self.img_id
        annotation['category_id'] = int(classname_to_id[label])
        annotation['segmentation'] = self._get_seg(points)
        annotation['bbox'] = self._get_box(points)
        annotation['iscrowd'] = 0
        annotation['area'] = self._get_area(points)
        return annotation

    # COCO bbox format: [x1, y1, w, h]
    def _get_box(self, points):
        min_x = points[0]
        min_y = points[1]
        max_x = points[2]
        max_y = points[3]
        return [min_x, min_y, max_x - min_x, max_y - min_y]
    # Compute the box area
    def _get_area(self, points):
        min_x = points[0]
        min_y = points[1]
        max_x = points[2]
        max_y = points[3]
        return (max_x - min_x+1) * (max_y - min_y+1)
    # segmentation
    def _get_seg(self, points):
        min_x = points[0]
        min_y = points[1]
        max_x = points[2]
        max_y = points[3]
        h = max_y - min_y
        w = max_x - min_x
        a = []
        a.append([min_x,min_y, min_x,min_y+0.5*h, min_x,max_y, min_x+0.5*w,max_y, max_x,max_y, max_x,max_y-0.5*h, max_x,min_y, max_x-0.5*w,min_y])
        return a
   
if __name__ == '__main__':
    csv_file = "images_label.csv"  # CSV file generated in the previous step
    image_dir = "images/"          # folder containing the images
    saved_coco_path = "./"         # where to write the COCO-format output
    # Collect the CSV annotations per image
    total_csv_annotations = {}
    # The CSV from the previous step has a header row and the columns
    # filename,width,height,class,xmin,ymin,xmax,ymax, while to_coco expects
    # xmin,ymin,xmax,ymax followed by the class label
    csv_df = pd.read_csv(csv_file)
    columns = ['filename', 'xmin', 'ymin', 'xmax', 'ymax', 'class']
    annotations = tqdm(csv_df[columns].values)
    for annotation in annotations:
        key = annotation[0].split(os.sep)[-1]
        value = np.array([annotation[1:]])
        if key in total_csv_annotations:
            total_csv_annotations[key] = np.concatenate((total_csv_annotations[key], value), axis=0)
        else:
            total_csv_annotations[key] = value
    # Split the data by image name
    total_keys = list(total_csv_annotations.keys())
    # Split into training and validation sets; test_size=0.1 means train:val = 9:1
    train_keys, val_keys = train_test_split(total_keys, test_size=0.1)
    print("train_n:", len(train_keys), 'val_n:', len(val_keys))
    # Create the required folders
    os.makedirs('%scoco/annotations/' % saved_coco_path, exist_ok=True)
    os.makedirs('%scoco/train2017/' % saved_coco_path, exist_ok=True)
    os.makedirs('%scoco/val2017/' % saved_coco_path, exist_ok=True)
    # Convert the training set to COCO JSON
    l2c_train = Csv2CoCo(image_dir=image_dir,total_annos=total_csv_annotations)
    train_instance = l2c_train.to_coco(train_keys)
    l2c_train.save_coco_json(train_instance, '%scoco/annotations/instances_train2017.json'%saved_coco_path)
    # Copy the original images into the train2017 and val2017 folders
    for file in train_keys:
        shutil.copy(image_dir+file,"%scoco/train2017/"%saved_coco_path)
    for file in val_keys:
        shutil.copy(image_dir+file,"%scoco/val2017/"%saved_coco_path)
    # Convert the validation set to COCO JSON
    l2c_val = Csv2CoCo(image_dir=image_dir,total_annos=total_csv_annotations)
    val_instance = l2c_val.to_coco(val_keys)
    l2c_val.save_coco_json(val_instance, '%scoco/annotations/instances_val2017.json'%saved_coco_path)
  1. After the conversion, copy the generated coco folder into mmdetection/data. You have to create the data directory yourself.
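
Once the JSON files are written, a quick consistency check can catch mistakes before training starts. The sketch below is an assumption-laden illustration rather than an official tool: check_coco is a hypothetical helper, and it only looks at the images, annotations, and categories fields that the conversion script above produces.

```python
def check_coco(instance):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    images = {img['id']: img for img in instance['images']}
    category_ids = {cat['id'] for cat in instance['categories']}
    for ann in instance['annotations']:
        image = images.get(ann['image_id'])
        if image is None:
            problems.append('annotation %d: unknown image_id' % ann['id'])
            continue
        if ann['category_id'] not in category_ids:
            problems.append('annotation %d: unknown category_id' % ann['id'])
        x, y, w, h = ann['bbox']  # COCO boxes are [x, y, width, height]
        if w <= 0 or h <= 0:
            problems.append('annotation %d: empty box' % ann['id'])
        if x < 0 or y < 0 or x + w > image['width'] or y + h > image['height']:
            problems.append('annotation %d: box outside image' % ann['id'])
    return problems


# Made-up data: the second annotation has a bad category and overflows the image
sample = {
    'images': [{'id': 0, 'file_name': 'image1.png', 'width': 213, 'height': 120}],
    'categories': [{'id': 1, 'name': 'human'}],
    'annotations': [
        {'id': 0, 'image_id': 0, 'category_id': 1, 'bbox': [10, 20, 50, 80]},
        {'id': 1, 'image_id': 0, 'category_id': 2, 'bbox': [200, 100, 30, 10]},
    ],
}
print(check_coco(sample))
# ['annotation 1: unknown category_id', 'annotation 1: box outside image']
```

To check a real file, load it with json.load and pass the resulting dict to check_coco.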

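5. Editing the config files

For a COCO-format dataset, make the following changes. In this example there is a single class, 'human'.

  1. Define the classes in mmdetection/mmdet/datasets/coco.py by replacing the CLASSES tuple with the tuple of classes in your own dataset. A one-element tuple needs a trailing comma. For example:
CLASSES = ('human',)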
  1. Next, edit the class list returned by coco_classes in mmdetection/mmdet/core/evaluation/class_names.py. This determines the class names drawn on the result images at test time. For example:
def coco_classes():
    return [
        'human'
    ]
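  1. Edit configs/faster_rcnn_r50_fpn_1x.py. The required changes are num_classes in the model dict, img_scale in the data dict, and total_epochs. Some of the config entries are explained below:
num_classes=2,  # number of classes + 1 (background)
img_scale=(213,120),  # image width and height; this may appear in several places, change all of them
imgs_per_gpu=2,      # effectively the per-GPU batch size; in distributed training, batch size = imgs_per_gpu * number of GPUs
workers_per_gpu=2,
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)  # lr is the learning rate
log_config = dict(
    interval=30,  # print a log line every 30 iterations
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')   # commented out in the stock config; uncomment it to write TensorBoard logs, which you can view in TensorBoard if tensorflow and tensorboard are installed
    ])
work_dir = './work_dirs/faster_rcnn_r50_fpn_1x'  # where checkpoints and logs are saved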
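
A Python detail matters for the CLASSES edit in the first step above: parentheses alone do not make a tuple, the comma does. With a single class, leaving out the trailing comma gives a plain string, and any code that iterates over CLASSES would then see five one-letter names instead of one class. The variable names below are only for illustration.

```python
# Without the trailing comma the parentheses are just grouping, so this is a str
wrong = ('human')
# With the trailing comma it is a one-element tuple
right = ('human',)

print(type(wrong).__name__, len(wrong))   # str 5
print(type(right).__name__, len(right))   # tuple 1

# Iterating shows the difference: characters versus class names
print(list(wrong))   # ['h', 'u', 'm', 'a', 'n']
print(list(right))   # ['human']
```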

To monitor training with TensorBoard, uncomment dict(type='TensorboardLoggerHook') in the config file, then run the following while the model is training:

#tensorboard --logdir={path to tf_log}
tensorboard --logdir=mmdetection/work_dirs/faster_rcnn_r50_fpn_1x/tf_log

6. Training the model

  1. Single-GPU training
#python tools/train.py ${config file}
python tools/train.py configs/faster_rcnn_r50_fpn_1x.py
  1. Multi-GPU distributed training
#./tools/dist_train.sh ${config file} ${number of GPUs} [optional arguments]
./tools/dist_train.sh configs/faster_rcnn_r50_fpn_1x.py 2

After training, the work_dirs folder contains the training logs and a .pth checkpoint for every epoch (these checkpoints are used for testing in the next section).
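
One setting worth revisiting when you change the number of GPUs is the learning rate. According to the official mmdetection documentation, the default lr=0.02 in the config files is meant for 8 GPUs with 2 images per GPU (batch size 16), and the docs recommend scaling it linearly with the batch size, for example lr=0.01 for 4 GPUs with 2 images per GPU. The helper below is a minimal sketch of that rule; the name scaled_lr and its defaults are assumptions made for this example, not an mmdetection API.

```python
def scaled_lr(num_gpus, imgs_per_gpu, base_lr=0.02, base_batch_size=16):
    """Scale the learning rate linearly with the total batch size.

    base_lr and base_batch_size assume the stock setting of 8 GPUs x 2 images per GPU.
    """
    batch_size = num_gpus * imgs_per_gpu
    return base_lr * batch_size / base_batch_size


print(scaled_lr(num_gpus=1, imgs_per_gpu=2))  # the single-GPU command above
print(scaled_lr(num_gpus=2, imgs_per_gpu=2))  # the 2-GPU dist_train.sh example
```

The result goes into the lr field of the optimizer dict in the config file.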

7. Testing the model

  1. Single-GPU testing
#python tools/test.py ${config file} ${trained checkpoint} [--out ${result file}] [--eval ${evaluation type}] [--show] (--show displays the detections for each image)
python tools/test.py configs/faster_rcnn_r50_fpn_1x.py \
    work_dirs/faster_rcnn_r50_fpn_1x/epoch_1.pth \
    --eval bbox --show
  1. Multi-GPU testing
#./tools/dist_test.sh ${config file} ${trained checkpoint} ${number of GPUs} [--out ${result file}] [--eval ${evaluation type}]
./tools/dist_test.sh configs/faster_rcnn_r50_fpn_1x.py \
    work_dirs/faster_rcnn_r50_fpn_1x/epoch_1.pth \
   8 --out results.pkl --eval bbox
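
The file written by --out can be inspected afterwards. The sketch below rests on an assumption about its layout: for a box-only detector such as Faster R-CNN, each image's entry is taken to be a list with one array per class, where every row is [x1, y1, x2, y2, score]. Check this against your own results.pkl before relying on it. The helper names count_detections and load_results are hypothetical.

```python
import pickle


def count_detections(results, score_thr=0.3):
    """Count detections per class index whose score is at least score_thr.

    Assumes results[i][c] holds rows of [x1, y1, x2, y2, score] for image i, class c.
    """
    counts = {}
    for per_image in results:
        for class_index, boxes in enumerate(per_image):
            kept = sum(1 for box in boxes if box[4] >= score_thr)
            counts[class_index] = counts.get(class_index, 0) + kept
    return counts


def load_results(path):
    """Load the file written by --out (a pickle; loading it needs numpy installed)."""
    with open(path, 'rb') as f:
        return pickle.load(f)


# Plain lists stand in for the arrays here so the example runs on its own
fake_results = [
    [[[10, 20, 60, 100, 0.92], [5, 5, 20, 40, 0.12]]],  # image 0, one class
    [[]],                                                # image 1, no detections
]
print(count_detections(fake_results))  # {0: 1}
```

With a real file, call count_detections(load_results('results.pkl')).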

