Last updated 2022-10-12. (The environment below is what I used and verified; if you run into problems, check for version-compatibility changes first.)
Pin your versions carefully, otherwise the run will fail midway with assorted "missing module" errors caused by version mismatches. Install the CUDA version that your GPU actually supports.
To change your CUDA version, see this article.
To change the CUDA version, download the runfile from the official site and install it manually.
Official site: link
Command that automatically resolves and downloads the right mmcv-full wheel:
pip install mmcv-full=={mmcv_version} -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
Replace the placeholders with your own version information.
Here is what I installed on my machine:
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu113/torch1.11.0/index.html
Install PyTorch and torchvision (the +cu113 builds are hosted on the PyTorch wheel index, so the extra index URL is required):
pip install torch==1.11.0+cu113 torchvision==0.12.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
Check that the versions are correct:
pip list
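torch and torchvision must be built against the same CUDA version; in the pip list output this shows up as the +cuXXX suffix on the version string. A tiny helper (the function name is my own) to extract and compare those tags:

```python
def cuda_tag(version):
    """Return the local version tag of a wheel version, e.g. '1.11.0+cu113' -> 'cu113'."""
    _, _, local = version.partition('+')
    return local or None

# The torch and torchvision wheels installed above carry the same tag:
assert cuda_tag('1.11.0+cu113') == cuda_tag('0.12.0+cu113') == 'cu113'
```

A CPU-only build has no suffix, so `cuda_tag` returns None for it.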
Go to your home directory and clone the project from GitHub:
git clone https://github.com/open-mmlab/mmdetection.git
Enter the project directory:
cd mmdetection
Install the dependencies and MMDetection itself:
pip install -r requirements/build.txt
pip install -v -e .
Online annotation tool: link
Click "Get Started"
Drag in your training images
Click "Object Detection" to begin labelling
Add a custom label, then click "Start project"
You can label with the polygon tool
Export in COCO or VOC format
Here I export xml files (VOC format) and post-process them below.
Inside the mmdetection directory:
mkdir -p ./data/VOCdevkit/VOC2007
Then create the following subdirectories under VOC2007:
Annotations holds the labelled xml files
JPEGImages holds the images
ImageSets/Main holds four txt files (train.txt, val.txt, trainval.txt, test.txt), each listing image basenames without extension, one per line
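The four split files can be generated from whatever is in JPEGImages; a minimal sketch (the function name and the 80/20 split ratio are my own choices, not from this post):

```python
import os
import random

def make_voc_splits(voc_root, train_ratio=0.8, seed=0):
    """Write ImageSets/Main/{train,val,trainval,test}.txt, one image basename per line."""
    img_dir = os.path.join(voc_root, 'JPEGImages')
    stems = sorted(os.path.splitext(f)[0] for f in os.listdir(img_dir)
                   if f.lower().endswith(('.jpg', '.jpeg', '.png')))
    random.Random(seed).shuffle(stems)
    n_train = int(len(stems) * train_ratio)
    splits = {'train': stems[:n_train], 'val': stems[n_train:],
              'trainval': stems, 'test': stems[n_train:]}
    out_dir = os.path.join(voc_root, 'ImageSets', 'Main')
    os.makedirs(out_dir, exist_ok=True)
    for name, ids in splits.items():
        with open(os.path.join(out_dir, name + '.txt'), 'w') as f:
            f.write('\n'.join(ids) + '\n')
    return splits
```

Here val and test reuse the same held-out images for simplicity; with enough data you would carve out a separate test split.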
Step 1: change the label classes.
Edit mmdetection/mmdet/datasets/voc.py and replace the CLASSES tuple with your own class names.
Step 2: edit mmdetection/mmdet/core/evaluation/class_names.py and change the return value of voc_classes() to your dataset's classes.
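For a hypothetical two-class dataset, the two edits would end up looking like this (the class names are placeholders; the point is that the two lists must match, in the same order):

```python
# mmdet/datasets/voc.py -- the CLASSES tuple on VOCDataset
CLASSES = ('cat', 'dog')

# mmdet/core/evaluation/class_names.py -- keep voc_classes() in sync
def voc_classes():
    return ['cat', 'dog']

assert tuple(voc_classes()) == CLASSES  # mismatched lists give wrong eval labels
```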
Step 3: modify the model config, taking Faster R-CNN as the example.
Find the base model file: mmdetection/configs/_base_/models/faster_rcnn_r50_fpn.py
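Inside that file the key change is num_classes in the bbox head: the COCO default is 80 and it must become the number of your own classes. A hypothetical excerpt for a two-class dataset (all other fields stay as shipped):

```python
# configs/_base_/models/faster_rcnn_r50_fpn.py (excerpt)
model = dict(
    roi_head=dict(
        bbox_head=dict(
            num_classes=2,  # was 80 (COCO); set to the number of your classes
        )))
```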
Step 4: modify the dataset config (VOC format).
Path: mmdetection/configs/_base_/datasets/voc0712.py
Comment out the original lines and adjust ann_file and img_prefix to your own paths. The complete file I use:
# dataset settings
dataset_type = 'VOCDataset'
data_root = 'data/VOCdevkit/VOC2007/'  # matches the directory created above
# data_root = 'data/VOCdevkit/'  # original value
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1000, 600), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1000, 600),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='Normalize', **img_norm_cfg),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=dict(
        type='RepeatDataset',
        times=3,
        dataset=dict(
            type=dataset_type,
            ann_file=[
                # adjusted to the new path
                data_root + 'ImageSets/Main/trainval.txt'
                # original lines, commented out:
                # data_root + 'VOC2007/ImageSets/Main/trainval.txt',
                # data_root + 'VOC2012/ImageSets/Main/trainval.txt'
            ],
            # adjusted likewise
            img_prefix=[data_root],
            # original line, commented out:
            # img_prefix=[data_root + 'VOC2007/', data_root + 'VOC2012/'],
            pipeline=train_pipeline)),
    val=dict(
        type=dataset_type,
        # same changes as above
        ann_file=data_root + 'ImageSets/Main/test.txt',
        img_prefix=data_root,
        # ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        # img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline),
    test=dict(
        type=dataset_type,
        # same changes as above
        ann_file=data_root + 'ImageSets/Main/test.txt',
        img_prefix=data_root,
        # ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        # img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline))
evaluation = dict(interval=1, metric='mAP')
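Before launching training it is worth checking that data_root really contains everything the config above references; a small hypothetical helper:

```python
import os

def missing_voc_paths(data_root):
    """Return the paths referenced by the dataset config that do not exist yet."""
    required = [
        os.path.join(data_root, 'Annotations'),
        os.path.join(data_root, 'JPEGImages'),
        os.path.join(data_root, 'ImageSets', 'Main', 'trainval.txt'),
        os.path.join(data_root, 'ImageSets', 'Main', 'test.txt'),
    ]
    return [p for p in required if not os.path.exists(p)]
```

An empty return value means the on-disk layout matches the config.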
Step 5: modify the training config, faster-rcnn-res50.
Find the corresponding training config file: mmdetection/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py
Set the inheritance paths:
_base_ = [
    '../_base_/models/faster_rcnn_r50_fpn.py',  # inherit the base model config
    # for a VOC-format dataset, use the voc0712.py dataset config
    '../_base_/datasets/voc0712.py',  # inherit the dataset config
    # '../_base_/datasets/coco_detection.py',  # original dataset config
    '../_base_/schedules/schedule_1x.py',  # inherit the schedule/optimizer config
    '../_base_/default_runtime.py'  # inherit the runtime config
]
Step 6: if you used the annotation site recommended above, one extra change is needed.
In mmdetection/mmdet/datasets/xml_style.py, change line 114 to: difficult = 0  # if difficult is None else int(difficult.text)
From the root of the mmdetection directory, run:
python tools/train.py ./configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py
If you get a path error, open xml_style.py and change the .jpg suffix to uppercase .JPG, then try again.
Place voc_to_coco_converter.py in the mmdetection directory and run it in the same environment:
python ./voc_to_coco_converter.py
This generates train.json under convert/coco/ inside the VOC2007 directory.
voc_to_coco_converter.py:
import xml.etree.ElementTree as ET
import os
import json
from datetime import datetime
import sys
import argparse

# initialization
coco = dict()
coco['images'] = []
coco['type'] = 'instances'
coco['annotations'] = []
coco['categories'] = []

category_set = dict()
image_set = set()

category_item_id = -1
image_id = 0  # was 000000, a leading-zero literal that is a syntax error in Python 3
annotation_id = 0
# add a new category item
def addCatItem(name):
    global category_item_id
    category_item = dict()
    category_item['supercategory'] = 'none'
    category_item_id += 1
    category_item['id'] = category_item_id
    category_item['name'] = name
    coco['categories'].append(category_item)
    category_set[name] = category_item_id
    return category_item_id
# add an image item
def addImgItem(file_name, size):
    global image_id
    if file_name is None:
        raise Exception('Could not find filename tag in xml file.')
    if size['width'] is None:
        raise Exception('Could not find width tag in xml file.')
    if size['height'] is None:
        raise Exception('Could not find height tag in xml file.')
    image_id += 1
    image_item = dict()
    image_item['id'] = image_id
    image_item['file_name'] = file_name
    image_item['width'] = size['width']
    image_item['height'] = size['height']
    image_item['license'] = None
    image_item['flickr_url'] = None
    image_item['coco_url'] = None
    image_item['date_captured'] = str(datetime.today())
    coco['images'].append(image_item)
    image_set.add(file_name)
    return image_id
# add an annotation item
def addAnnoItem(object_name, image_id, category_id, bbox):
    global annotation_id
    annotation_item = dict()
    annotation_item['segmentation'] = []
    seg = []
    # bbox[] is x,y,w,h
    # left_top
    seg.append(bbox[0])
    seg.append(bbox[1])
    # left_bottom
    seg.append(bbox[0])
    seg.append(bbox[1] + bbox[3])
    # right_bottom
    seg.append(bbox[0] + bbox[2])
    seg.append(bbox[1] + bbox[3])
    # right_top
    seg.append(bbox[0] + bbox[2])
    seg.append(bbox[1])
    annotation_item['segmentation'].append(seg)
    annotation_item['area'] = bbox[2] * bbox[3]
    annotation_item['iscrowd'] = 0
    annotation_item['ignore'] = 0
    annotation_item['image_id'] = image_id
    annotation_item['bbox'] = bbox
    annotation_item['category_id'] = category_id
    annotation_id += 1
    annotation_item['id'] = annotation_id
    coco['annotations'].append(annotation_item)
# read the image id list into an array
def read_image_ids(image_sets_file):
    ids = []
    with open(image_sets_file, 'r') as f:
        for line in f.readlines():
            ids.append(line.strip())
    return ids
# convert the xml files
def parseXmlFilse(data_dir, json_save_path, split='train'):
    assert os.path.exists(data_dir), "data path:{} does not exist".format(data_dir)
    labelfile = split + ".txt"
    image_sets_file = os.path.join(data_dir, "ImageSets", "Main", labelfile)
    xml_files_list = []
    if os.path.isfile(image_sets_file):
        ids = read_image_ids(image_sets_file)
        xml_files_list = [os.path.join(data_dir, "Annotations", f"{i}.xml") for i in ids]
    elif os.path.isdir(data_dir):
        # change the xml path here if needed
        # xml_dir = os.path.join(data_dir, "labels/voc")
        xml_dir = data_dir
        xml_list = os.listdir(xml_dir)
        xml_files_list = [os.path.join(xml_dir, i) for i in xml_list]
    for xml_file in xml_files_list:
        if not xml_file.endswith('.xml'):
            continue
        tree = ET.parse(xml_file)
        root = tree.getroot()
        # initialization
        size = dict()
        size['width'] = None
        size['height'] = None
        if root.tag != 'annotation':
            raise Exception('pascal voc xml root element should be annotation, rather than {}'.format(root.tag))
        # extract the image file name
        file_name = root.findtext('filename')
        assert file_name is not None, "filename is not in the file"
        # extract the image size {width, height, depth}
        size_info = root.findall('size')
        assert size_info is not None, "size is not in the file"
        for subelem in size_info[0]:
            size[subelem.tag] = int(subelem.text)
        if file_name is not None and size['width'] is not None and file_name not in image_set:
            # add to coco['images'] and get the current image id
            current_image_id = addImgItem(file_name, size)
            print('add image with name: {}\tand\tsize: {}'.format(file_name, size))
        elif file_name in image_set:
            raise Exception('file_name duplicated')
        else:
            raise Exception("file name:{}\t size:{}".format(file_name, size))
        # extract the annotations for every object in this image
        object_info = root.findall('object')
        if len(object_info) == 0:
            continue
        # iterate over each annotated object
        for obj in object_info:
            # extract the object name
            object_name = obj.findtext('name')
            if object_name not in category_set:
                # create a category index
                current_category_id = addCatItem(object_name)
            else:
                current_category_id = category_set[object_name]
            # initialize the box dict
            bndbox = dict()
            bndbox['xmin'] = None
            bndbox['xmax'] = None
            bndbox['ymin'] = None
            bndbox['ymax'] = None
            # extract the box: [xmin, ymin, xmax, ymax]
            bndbox_info = obj.findall('bndbox')
            for box in bndbox_info[0]:
                bndbox[box.tag] = int(box.text)
            if bndbox['xmin'] is not None:
                if object_name is None:
                    raise Exception('xml structure broken at bndbox tag')
                if current_image_id is None:
                    raise Exception('xml structure broken at bndbox tag')
                if current_category_id is None:
                    raise Exception('xml structure broken at bndbox tag')
                bbox = []
                # x
                bbox.append(bndbox['xmin'])
                # y
                bbox.append(bndbox['ymin'])
                # w
                bbox.append(bndbox['xmax'] - bndbox['xmin'])
                # h
                bbox.append(bndbox['ymax'] - bndbox['ymin'])
                print('add annotation with object_name:{}\timage_id:{}\tcat_id:{}\tbbox:{}'.format(
                    object_name, current_image_id, current_category_id, bbox))
                addAnnoItem(object_name, current_image_id, current_category_id, bbox)
    json_parent_dir = os.path.dirname(json_save_path)
    if not os.path.exists(json_parent_dir):
        os.makedirs(json_parent_dir)
    json.dump(coco, open(json_save_path, 'w'))
    print("class nums:{}".format(len(coco['categories'])))
    print("image nums:{}".format(len(coco['images'])))
    print("bbox nums:{}".format(len(coco['annotations'])))
if __name__ == '__main__':
    """
    Script notes:
        Converts VOC-format .xml annotation files into a single COCO-format .json file.
    Arguments:
        voc_data_dir: accepts two forms
            1. the path of a VOC folder; the script finds VOC/ImageSets/Main/xx.txt itself
            2. a folder that directly contains the xml label files
        json_save_path: where to write the output json file
        split: used with form 1 to pick xx.txt (e.g. train.txt); ignored with form 2
    """
    parser = argparse.ArgumentParser()
    parser.add_argument('-d', '--voc-dir', type=str, default='data/label/voc', help='voc path')
    parser.add_argument('-s', '--save-path', type=str, default='./data/convert/coco/train.json', help='json save path')
    parser.add_argument('-t', '--type', type=str, default='train', help='only use in voc2012/2007')
    opt = parser.parse_args()
    if len(sys.argv) > 1:
        print(opt)
        parseXmlFilse(opt.voc_dir, opt.save_path, opt.type)
    else:
        voc_data_dir = './data/VOCdevkit/VOC2007/Annotations/'
        json_save_path = './data/VOCdevkit/VOC2007/convert/coco/train.json'
        # voc_data_dir = r'D:\dataset\VOC2012\VOCdevkit\VOC2012'
        # voc_data_dir = './data/labels/voc'
        # json_save_path = './data/convert/coco/train.json'
        split = 'train'
        parseXmlFilse(data_dir=voc_data_dir, json_save_path=json_save_path, split=split)
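A quick way to double-check that the saved file matches the counts the script prints (the helper name is my own):

```python
import json

def summarize_coco(json_path):
    """Return the number of images, annotations and categories in a COCO json file."""
    with open(json_path) as f:
        coco = json.load(f)
    return {key: len(coco[key]) for key in ('images', 'annotations', 'categories')}
```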
Prepare a classes.txt file in the VOC2007 directory, containing your label names, one per line.
In voc0712.py, point the test section at the converted annotation file:
    test=dict(
        # tell the model this is a COCO-type dataset
        type='CocoDataset',
        # same pattern of changes as before
        ann_file=data_root + 'annotations/coco.json',
        img_prefix='',
        # ann_file=data_root + 'VOC2007/ImageSets/Main/test.txt',
        # img_prefix=data_root + 'VOC2007/',
        pipeline=test_pipeline)
Change the CLASSES names in ./mmdet/datasets/coco.py accordingly, then run:
python tools/dataset_converters/images2coco.py data/VOCdevkit/VOC2007/JPEGImages data/VOCdevkit/VOC2007/classes.txt coco.json
Start testing:
python tools/test.py ./configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py ./work_dirs/faster_rcnn_r50_fpn_1x_coco/latest.pth --show-dir results --format-only --eval-options "jsonfile_prefix=./results"
The detection results are stored as results.bbox.json in the root directory.
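results.bbox.json is a flat list of detections of the form {image_id, bbox [x, y, w, h], score, category_id}; a small hypothetical snippet to keep only the confident ones:

```python
import json

def top_detections(result_json, score_thr=0.3):
    """Load a COCO-style bbox result file and drop low-score detections."""
    with open(result_json) as f:
        dets = json.load(f)
    return [d for d in dets if d['score'] >= score_thr]
```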
Install seaborn before running any visualization (the log-analysis tool depends on it):
pip install seaborn
Plot the mAP curve from the training log:
python tools/analysis_tools/analyze_logs.py plot_curve ./work_dirs/faster_rcnn_r50_fpn_1x_coco/20220501_151937.log.json --keys mAP --legend mAP --out mAP_results.png
The plot is saved to mmdetection/mAP_results.png.
The same tool can also plot loss curves (loss_cls and loss_bbox are logged for Faster R-CNN as well, not only for YOLO-style models); change the command to:
python tools/analysis_tools/analyze_logs.py plot_curve ./work_dirs/faster_rcnn_r50_fpn_1x_coco/20220501_151937.log.json --keys loss_cls loss_bbox --legend loss_cls loss_bbox --out loss_result.png