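SenseTime (winner of the 2018 COCO object detection challenge) and the Chinese University of Hong Kong recently open-sourced mmdetection, a deep learning object detection toolbox implemented in PyTorch. It supports mainstream detection frameworks such as Faster R-CNN, Mask R-CNN, Fast R-CNN and Cascade R-CNN, and makes it quick to train and deploy your own models.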
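Following the official documentation, first install Anaconda, create a Python virtual environment, and install the dependencies with conda:
conda create -n open-mmlab python=3.7 -y # create a virtual environment named open-mmlab with Python 3.7
conda activate open-mmlab # enter the virtual environment
conda install pytorch torchvision -c pytorch
conda will then list the packages it is going to install.
If no version is specified, the latest PyTorch is installed (1.4, built for CUDA 10.1, at the time of writing). You can also pin PyTorch and the matching CUDA version, for example PyTorch 1.1 with CUDA 9.0: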
conda install pytorch=1.1 torchvision cudatoolkit=9.0 -c pytorch
git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
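pip install mmcv cython # install mmcv and cython
pip install "albumentations>=0.3.2" imagecorruptions pycocotools six terminaltables # install the dependencies; the quotes stop the shell from treating > as a redirect
python setup.py develop # build the mmdetection library; this takes a while
If pip is slow, append -i https://pypi.tuna.tsinghua.edu.cn/simple to use a mirror in China. For example, to install mmcv from the Tsinghua mirror:
pip install mmcv -i https://pypi.tuna.tsinghua.edu.cn/simple
mmcv must be installed before running setup.py, otherwise the build fails.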
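Before moving on, it can help to confirm that the environment actually contains the packages installed above. The following is a minimal sketch, not part of mmdetection: `check_install` is a hypothetical helper that only checks whether each import name can be located, and it prints the PyTorch version and CUDA availability when PyTorch is present.

```python
import importlib.util

# Import names (not pip names) of the packages this guide installs.
REQUIRED = ["torch", "torchvision", "mmcv", "mmdet"]


def check_install(packages=REQUIRED):
    """Return {package: True/False} depending on whether it can be located."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


if __name__ == "__main__":
    status = check_install()
    for name, ok in status.items():
        print(f"{name:12s} {'found' if ok else 'MISSING'}")
    if status["torch"]:
        import torch  # imported lazily so the script also runs without PyTorch

        print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
```

Run it inside the open-mmlab environment; any package reported as MISSING needs to be installed before training.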
from mmdet.apis import init_detector, inference_detector, show_result
import mmcv
config_file = 'configs/faster_rcnn_r50_fpn_1x.py'
checkpoint_file = 'checkpoints/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth'
# build the model from a config file and a checkpoint file
model = init_detector(config_file, checkpoint_file, device='cuda:0')
# test a single image and show the results
img = 'test.jpg' # or img = mmcv.imread(img), which will only load it once
result = inference_detector(model, img)
# visualize the results in a new window
show_result(img, result, model.CLASSES)
# or save the visualization results to image files
show_result(img, result, model.CLASSES, out_file='result.jpg')
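If you want to work with the detections rather than only draw them, the structure of `result` matters. In mmdetection v1.x a box-only detector such as Faster R-CNN returns one array per class, where each row is [x1, y1, x2, y2, score]; mask models return a tuple instead and are not covered here. The sketch below assumes that layout. `filter_detections` is a hypothetical helper, and the sample data is hand-written rather than real model output.

```python
def filter_detections(result, class_names, score_thr=0.3):
    """Flatten a per-class detection list into dicts above a score threshold.

    `result` is assumed to hold one sequence of [x1, y1, x2, y2, score] rows
    per class, in the same order as `class_names`.
    """
    kept = []
    for name, dets in zip(class_names, result):
        for row in dets:
            x1, y1, x2, y2, score = (float(v) for v in row)
            if score >= score_thr:
                kept.append({"label": name, "bbox": [x1, y1, x2, y2], "score": score})
    # Highest-confidence detections first.
    return sorted(kept, key=lambda d: d["score"], reverse=True)


# Hand-written stand-in for a real result (plain lists instead of arrays).
fake_result = [[[10, 20, 110, 220, 0.92], [5, 5, 50, 60, 0.12]]]
for det in filter_detections(fake_result, ["human"]):
    print(det)
```

With a real model you would call `filter_detections(result, model.CLASSES)`; the function only iterates over rows, so it accepts the NumPy arrays returned by inference as well.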
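mmdetection supports the COCO, VOC and Cityscapes datasets, but most of the pretrained models and config files were trained on COCO, so I recommend building your dataset in COCO format. The COCO directory layout is: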
mmdetection
├── data
│ ├── coco
│ │ ├── annotations
│ │ │ ├── instances_train2017.json
│ │ │ ├── instances_val2017.json
│ │ ├── train2017
│ │ │ ├── image1.png
│ │ │ ├── image2.png
│ │ │ ├── …
│ │ ├── val2017
│ │ │ ├── image1.png
│ │ │ ├── image2.png
│ │ │ ├── …
Note: testing uses the validation (val) data.
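For orientation, the JSON files under annotations/ share one basic structure: a list of images, a list of annotations that point at images and categories by id, and a list of categories. The sketch below builds the smallest such structure for box detection. All values are made up for illustration, and optional top-level fields such as info and licenses are left out.

```python
import json

coco = {
    "images": [
        {"id": 0, "file_name": "image1.png", "width": 213, "height": 120},
    ],
    "annotations": [
        {
            "id": 0,
            "image_id": 0,              # refers to images[].id
            "category_id": 1,           # refers to categories[].id
            "bbox": [30, 40, 50, 60],   # [x, y, width, height], not [x1, y1, x2, y2]
            "area": 50 * 60,
            "iscrowd": 0,
            "segmentation": [],         # left empty in this box-only sketch
        },
    ],
    "categories": [{"id": 1, "name": "human"}],
}

print(json.dumps(coco, indent=2))
```

The most common mistake when writing these files by hand is storing the box as corner coordinates; COCO expects the top-left corner plus width and height.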
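Image annotation files come in many forms, and CSV works well as an intermediate format because it can be converted to almost anything else. I recommend the code below for converting your own data to COCO format: the first script converts XML files annotated with labelImg to CSV, and the second converts that CSV to COCO format.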
import glob
import os
import xml.etree.ElementTree as ET

import pandas as pd


def xml_to_csv(path):
    """Collect every bounding box from the labelImg XML files in `path`."""
    rows = []
    for xml_file in glob.glob(os.path.join(path, '*.xml')):
        root = ET.parse(xml_file).getroot()
        filename = root.find('filename').text
        for member in root.findall('object'):
            # Look fields up by tag name instead of by position.
            box = member.find('bndbox')
            rows.append((
                filename,
                int(box.find('xmin').text),
                int(box.find('ymin').text),
                int(box.find('xmax').text),
                int(box.find('ymax').text),
                member.find('name').text,
            ))
    # Column order matches what the CSV-to-COCO script reads:
    # filename, the four box coordinates, then the class label.
    column_name = ['filename', 'xmin', 'ymin', 'xmax', 'ymax', 'class']
    return pd.DataFrame(rows, columns=column_name)


def main():
    image_path = '/media/labels_xml'  # folder containing the XML files annotated with labelImg
    csv_path = 'image_label.csv'  # output CSV file
    xml_df = xml_to_csv(image_path)
    # No header row: the CSV-to-COCO script reads the file with header=None.
    xml_df.to_csv(csv_path, index=False, header=False)
    print('Successfully converted xml to csv.')


main()
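To see what the XML-to-CSV step extracts without touching the file system, the sketch below parses a single labelImg-style (Pascal VOC) annotation from a string. The sample XML and the `parse_voc_xml` helper are made up for illustration. Fields are looked up by tag name, and each row comes out as filename, xmin, ymin, xmax, ymax, class, which is the layout the CSV-to-COCO script reads.

```python
import xml.etree.ElementTree as ET

# A made-up annotation in the Pascal VOC layout that labelImg writes.
SAMPLE_XML = """
<annotation>
  <filename>image1.png</filename>
  <size><width>213</width><height>120</height><depth>3</depth></size>
  <object>
    <name>human</name>
    <bndbox><xmin>30</xmin><ymin>40</ymin><xmax>80</xmax><ymax>100</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_xml(xml_text):
    """Return one (filename, xmin, ymin, xmax, ymax, class) tuple per object."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # int(float(...)) also accepts coordinates written as "30.0".
        coords = [int(float(box.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")]
        rows.append((filename, *coords, obj.findtext("name")))
    return rows


print(parse_voc_xml(SAMPLE_XML))
# [('image1.png', 30, 40, 80, 100, 'human')]
```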
# -*- coding: utf-8 -*-
import json
import os
import shutil
import time

import cv2
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from tqdm import tqdm

np.random.seed(41)

# 0 is reserved for the background
classname_to_id = {"human": 1}  # change this to your own classes and their IDs


class Csv2CoCo:

    def __init__(self, image_dir, total_annos):
        self.images = []
        self.annotations = []
        self.categories = []
        self.img_id = 0
        self.ann_id = 0
        self.image_dir = image_dir
        self.total_annos = total_annos

    def save_coco_json(self, instance, save_path):
        with open(save_path, 'w', encoding='utf-8') as f:
            json.dump(instance, f, ensure_ascii=False, indent=2)  # indent=2 for readable output

    # Build the COCO structure from the CSV annotations
    def to_coco(self, keys):
        self._init_categories()
        for key in keys:
            self.images.append(self._image(key))
            shapes = self.total_annos[key]
            for shape in shapes:
                # All columns except the last are box coordinates; the last is the label.
                bboxi = []
                for cor in shape[:-1]:
                    bboxi.append(int(cor))
                label = shape[-1]
                annotation = self._annotation(bboxi, label)
                self.annotations.append(annotation)
                self.ann_id += 1
            self.img_id += 1
        instance = {}
        instance['info'] = 'spytensor created'
        instance['licenses'] = ['license']
        instance['images'] = self.images
        instance['annotations'] = self.annotations
        instance['categories'] = self.categories
        return instance

    # Build the categories field
    def _init_categories(self):
        for k, v in classname_to_id.items():
            category = {}
            category['id'] = v
            category['name'] = k
            self.categories.append(category)

    # Build a COCO image entry
    def _image(self, path):
        image = {}
        img = cv2.imread(self.image_dir + path)
        if img is None:
            # cv2.imread returns None instead of raising when the file cannot be read
            raise FileNotFoundError(self.image_dir + path)
        image['height'] = img.shape[0]
        image['width'] = img.shape[1]
        image['id'] = self.img_id
        image['file_name'] = path
        return image

    # Build a COCO annotation entry
    def _annotation(self, shape, label):
        points = shape[:4]
        annotation = {}
        annotation['id'] = self.ann_id
        annotation['image_id'] = self.img_id
        annotation['category_id'] = int(classname_to_id[label])
        annotation['segmentation'] = self._get_seg(points)
        annotation['bbox'] = self._get_box(points)
        annotation['iscrowd'] = 0
        annotation['area'] = self._get_area(points)
        return annotation

    # COCO bbox format: [x1, y1, w, h]
    def _get_box(self, points):
        min_x = points[0]
        min_y = points[1]
        max_x = points[2]
        max_y = points[3]
        return [min_x, min_y, max_x - min_x, max_y - min_y]

    # Compute the box area
    def _get_area(self, points):
        min_x = points[0]
        min_y = points[1]
        max_x = points[2]
        max_y = points[3]
        return (max_x - min_x + 1) * (max_y - min_y + 1)

    # Segmentation: an 8-point polygon that traces the bounding box
    def _get_seg(self, points):
        min_x = points[0]
        min_y = points[1]
        max_x = points[2]
        max_y = points[3]
        h = max_y - min_y
        w = max_x - min_x
        a = []
        a.append([min_x, min_y, min_x, min_y + 0.5 * h, min_x, max_y, min_x + 0.5 * w, max_y,
                  max_x, max_y, max_x, max_y - 0.5 * h, max_x, min_y, max_x - 0.5 * w, min_y])
        return a
if __name__ == '__main__':
    csv_file = "image_label.csv"  # CSV file generated in the previous step
    image_dir = "images/"  # image folder
    saved_coco_path = "./"  # where the COCO-format dataset is written
    # Gather the CSV annotations per image.
    # Each row is expected to be: filename,xmin,ymin,xmax,ymax,class (no header row).
    total_csv_annotations = {}
    annotations = tqdm(pd.read_csv(csv_file, header=None).values)
    for annotation in annotations:
        time.sleep(0.1)  # slows the loop so the progress bar is visible; safe to remove for large datasets
        key = annotation[0].split(os.sep)[-1]
        value = np.array([annotation[1:]])
        if key in total_csv_annotations.keys():
            total_csv_annotations[key] = np.concatenate((total_csv_annotations[key], value), axis=0)
        else:
            total_csv_annotations[key] = value
    # Split the data by image file name
    total_keys = list(total_csv_annotations.keys())
    # Split into training and validation sets; test_size=0.1 means train:val = 9:1
    train_keys, val_keys = train_test_split(total_keys, test_size=0.1)
    print("train_n:", len(train_keys), 'val_n:', len(val_keys))
    # Create the required folders
    for sub in ('annotations', 'train2017', 'val2017'):
        os.makedirs('%scoco/%s/' % (saved_coco_path, sub), exist_ok=True)
    # Convert the training set to COCO JSON
    l2c_train = Csv2CoCo(image_dir=image_dir, total_annos=total_csv_annotations)
    train_instance = l2c_train.to_coco(train_keys)
    # The file name must match the directory layout shown earlier (instances_train2017.json)
    l2c_train.save_coco_json(train_instance, '%scoco/annotations/instances_train2017.json' % saved_coco_path)
    # Copy the original images into the train2017 and val2017 folders
    for file in train_keys:
        shutil.copy(image_dir + file, "%scoco/train2017/" % saved_coco_path)
    for file in val_keys:
        shutil.copy(image_dir + file, "%scoco/val2017/" % saved_coco_path)
    # Convert the validation set to COCO JSON
    l2c_val = Csv2CoCo(image_dir=image_dir, total_annos=total_csv_annotations)
    val_instance = l2c_val.to_coco(val_keys)
    l2c_val.save_coco_json(val_instance, '%scoco/annotations/instances_val2017.json' % saved_coco_path)
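After generating the JSON files it is worth checking them before starting a long training run, because a wrong id or an out-of-bounds box tends to surface only as a confusing error later. The following is a minimal sketch of such a check. `check_coco` is a hypothetical helper, not part of mmdetection or pycocotools, and it only covers id references and box bounds.

```python
def check_coco(instance):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    images = {img["id"]: img for img in instance["images"]}
    category_ids = {cat["id"] for cat in instance["categories"]}
    for ann in instance["annotations"]:
        img = images.get(ann["image_id"])
        if img is None:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
            continue
        if ann["category_id"] not in category_ids:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        x, y, w, h = ann["bbox"]  # COCO boxes are [x, y, width, height]
        if w <= 0 or h <= 0:
            problems.append(f"annotation {ann['id']}: non-positive box size {w}x{h}")
        elif x < 0 or y < 0 or x + w > img["width"] or y + h > img["height"]:
            problems.append(f"annotation {ann['id']}: box exceeds image {img['file_name']}")
    return problems


# Made-up data: the second box runs past the right edge of a 213x120 image.
sample = {
    "images": [{"id": 0, "file_name": "image1.png", "width": 213, "height": 120}],
    "categories": [{"id": 1, "name": "human"}],
    "annotations": [
        {"id": 0, "image_id": 0, "category_id": 1, "bbox": [30, 40, 50, 60]},
        {"id": 1, "image_id": 0, "category_id": 1, "bbox": [200, 40, 50, 60]},
    ],
}
for problem in check_coco(sample):
    print(problem)
```

To check a real file, load it with `json.load` and pass the resulting dict to `check_coco`; an empty list means none of these particular problems were found.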
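For a COCO-format dataset, make the following changes. For example, I have only one class, 'human'. In mmdetection v1.x, CLASSES is defined in mmdet/datasets/coco.py and coco_classes() in mmdet/core/evaluation/class_names.py.
CLASSES = ('human',)  # the trailing comma is required; ('human') is just a string, not a tuple
def coco_classes():
    return [
        'human'
    ]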
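Then edit the model config (configs/faster_rcnn_r50_fpn_1x.py in this example):
num_classes=2,  # number of classes + 1 (for the background)
img_scale=(213,120),  # image width and height; this appears in several places, and all of them may need changing
imgs_per_gpu=2,  # effectively the per-GPU batch size; in distributed training the total batch size = imgs_per_gpu * number of GPUs
workers_per_gpu=2,
optimizer = dict(type='SGD', lr=0.02, momentum=0.9, weight_decay=0.0001)  # lr is the learning rate
log_config = dict(
    interval=30,  # print a log line every 30 iterations
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')  # commented out in the default config; enable it to write TensorBoard logs (requires tensorflow and tensorboard)
    ])
work_dir = './work_dirs/faster_rcnn_r50_fpn_1x'  # where checkpoints and logs are saved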
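One setting that is easy to overlook is the learning rate. According to the mmdetection documentation, the default lr=0.02 is meant for 8 GPUs with 2 images per GPU (total batch size 16), and the linear scaling rule says to scale the learning rate in proportion to the total batch size. The sketch below does that arithmetic; `scaled_lr` is a hypothetical helper, not an mmdetection function.

```python
def scaled_lr(num_gpus, imgs_per_gpu, base_lr=0.02, base_batch_size=16):
    """Scale the learning rate linearly with the total batch size."""
    return base_lr * (num_gpus * imgs_per_gpu) / base_batch_size


for gpus in (1, 2, 4, 8):
    print(f"{gpus} GPU(s) x 2 imgs_per_gpu -> lr = {scaled_lr(gpus, 2):.4f}")
```

For a single GPU with imgs_per_gpu=2 this gives lr=0.0025, which is the value to put in the optimizer dict. Training on one GPU with the unchanged default of 0.02 is a common cause of a diverging (NaN) loss.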
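To monitor training with TensorBoard, uncomment dict(type='TensorboardLoggerHook') in the config file and, while the model is training, run:
# tensorboard --logdir={tf_log directory}
tensorboard --logdir=mmdetection/work_dirs/faster_rcnn_r50_fpn_1x/tf_log
# Single-GPU training: python tools/train.py ${CONFIG_FILE}
python tools/train.py configs/faster_rcnn_r50_fpn_1x.py
# Multi-GPU training: ./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
./tools/dist_train.sh configs/faster_rcnn_r50_fpn_1x.py 2
When training finishes, the work_dirs folder contains the training log files and one .pth checkpoint per epoch (these checkpoints are used for testing below).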
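When scripting the test step it is convenient to pick the most recent checkpoint automatically. The sketch below assumes checkpoints are named epoch_N.pth, which is how mmdetection v1.x saves them as far as I know; adjust the pattern if your files are named differently. `latest_checkpoint` is a hypothetical helper.

```python
import os
import re


def latest_checkpoint(work_dir):
    """Return the path of the highest-numbered epoch_N.pth in work_dir, or None."""
    pattern = re.compile(r"^epoch_(\d+)\.pth$")
    best = None
    for name in os.listdir(work_dir):
        match = pattern.match(name)
        if match:
            epoch = int(match.group(1))  # compare numerically so epoch_10 beats epoch_9
            if best is None or epoch > best[0]:
                best = (epoch, os.path.join(work_dir, name))
    return best[1] if best else None


if __name__ == "__main__":
    work_dir = "./work_dirs/faster_rcnn_r50_fpn_1x"
    if os.path.isdir(work_dir):
        print(latest_checkpoint(work_dir))
    else:
        print("work dir not found:", work_dir)
```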
# Single-GPU testing: python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--out ${RESULT_FILE}] [--eval ${EVAL_TYPE}] [--show] (--show displays the detection result for each image)
python tools/test.py configs/faster_rcnn_r50_fpn_1x.py \
work_dirs/faster_rcnn_r50_fpn_1x/epoch_1.pth \
--eval bbox --show
# Multi-GPU testing: ./tools/dist_test.sh ${CONFIG_FILE} ${CHECKPOINT_FILE} ${GPU_NUM} [--out ${RESULT_FILE}] [--eval ${EVAL_TYPE}]
./tools/dist_test.sh configs/faster_rcnn_r50_fpn_1x.py \
work_dirs/faster_rcnn_r50_fpn_1x/epoch_1.pth \
8 --out results.pkl --eval bbox
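The file written by --out is a pickle, so it can be inspected afterwards without rerunning the model. The sketch below assumes a box-only detector, in which case the file holds one entry per test image, each a list with one array of [x1, y1, x2, y2, score] rows per class. `count_detections` is a hypothetical helper, and NumPy must be installed to unpickle a real results file.

```python
import os
import pickle
from collections import Counter


def count_detections(pkl_path, class_names, score_thr=0.5):
    """Count detections per class with score >= score_thr across all test images."""
    with open(pkl_path, "rb") as f:
        results = pickle.load(f)  # one entry per test image
    counts = Counter()
    for per_image in results:
        for name, dets in zip(class_names, per_image):
            counts[name] += sum(1 for row in dets if float(row[4]) >= score_thr)
    return counts


if __name__ == "__main__":
    if os.path.exists("results.pkl"):
        print(count_detections("results.pkl", ["human"]))
    else:
        print("results.pkl not found; run tools/test.py with --out first")
```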