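Contents
I. COCO2017 dataset format
II. Existing annotation format
III. Format conversion
1. Create the directories
2. Generate the train and val image-name text files
3. Move the images into the corresponding directories
4. Generate the JSON file
IV. VisDrone
1. Convert the txt labels in annotations to XML files
2. xml2json
Training notes:
Reference: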
COCO_ROOT  # root directory
├── annotations  # annotations in JSON format
│   ├── instances_train2017.json
│   └── instances_val2017.json
├── train2017  # image files
│   ├── 000000000001.jpg
│   ├── 000000000002.jpg
│   └── 000000000003.jpg
└── val2017
    ├── 000000000004.jpg
    └── 000000000005.jpg
All COCO bounding-box annotations are stored in JSON files. A parsed JSON file is a dictionary with the following layout:
{
"info": info,
"images": [image],
"annotations": [annotation],
"categories": [categories],
"licenses": [license],
}
info records some basic information about the dataset:
"info":{ "description":"This is stable 1.0 version of the 2014 MS COCO dataset.", "url":"http:\/\/mscoco.org", "version":"1.0", "year":2017, "contributor":"Microsoft COCO group", "date_created":"2017-01-27 09:11:52.357475" }
licenses lists the licenses the dataset is released under. It is a list whose entries look like this:
"licenses":{ "url":"http:\/\/creativecommons.org\/licenses\/by-nc-sa\/2.0\/", "id":1, "name":"Attribution-NonCommercial-ShareAlike License" }
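When building your own dataset, info and licenses are not needed; the three fields images, annotations and categories are enough.
images is a list of dictionaries that stores each image's file name, height, width and id. The id is the image's number; it is unique and is also referenced from annotations. The list contains one dictionary per image.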
# json['images'][0]
{
'file_name': '000000397133.jpg',
'height': 427,
'width': 640,
'id': 397133
}
"images":{ "coco_url": "", "date_captured": "", "file_name": "000001.jpg", "flickr_url": "", "id": 1, "license": 0, "width": 416, "height": 416 }
categories lists all the classes; define as many entries as you have classes. Class ids start from 1, and 0 is the background. The format is:
"categories":{ "id": int, "name": str, "supercategory": str, }
[
{'supercategory': 'person', 'id': 1, 'name': 'person'},
{'supercategory': 'vehicle', 'id': 2, 'name': 'bicycle'},
{'supercategory': 'vehicle', 'id': 3, 'name': 'car'},
{'supercategory': 'vehicle', 'id': 4, 'name': 'motorcycle'},
{'supercategory': 'vehicle', 'id': 5, 'name': 'airplane'},
{'supercategory': 'vehicle', 'id': 6, 'name': 'bus'},
{'supercategory': 'vehicle', 'id': 7, 'name': 'train'},
{'supercategory': 'vehicle', 'id': 8, 'name': 'truck'},
{'supercategory': 'vehicle', 'id': 9, 'name': 'boat'}
# ....
]
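A minimal sketch of building the categories list from a plain list of class names, assuming the ids are assigned in order starting from 1 as described above. The helper name build_categories and the default supercategory value are illustrative choices, not part of any COCO tooling.

```python
def build_categories(class_names, supercategory="none"):
    """Return COCO-style category dicts with ids starting at 1 (0 is background)."""
    return [
        {"supercategory": supercategory, "id": idx, "name": name}
        for idx, name in enumerate(class_names, start=1)
    ]


categories = build_categories(["person", "bicycle", "car"])
```

The resulting list can be assigned directly to the "categories" key of the JSON dictionary.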
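annotations holds the instance masks in the dataset; their number equals the number of bounding boxes. The segmentation format depends on whether the instance is a single object (iscrowd=0, stored as polygons, i.e. lists of polygon vertices) or a group of objects (iscrowd=1, stored in RLE format, an encoded mask).
annotations also holds the detection-box labels. A single bounding box looks like this: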
{'segmentation': [[]],
'area': 2400.0,
'iscrowd': 0,
'image_id': 289343,
'bbox': [0., 0., 60., 40.],
'category_id': 1,
'id': 1768}
"annotations":{ "id": int, "image_id": int, "category_id": int, "segmentation": RLE or [polygon], "area": float, "bbox": [x,y,width,height], "iscrowd": 0 or 1 }
# An instance represented by polygon vertices:
"annotations":{ "segmentation": [[510.66,423.01,511.72,420.03,510.45......]], "area": 702.1057499999998, "iscrowd": 0, "image_id": 289343, "bbox": [473.07,395.93,38.65,28.67], "category_id": 18, "id": 1768 }
# Parse the category ids and image ids:
from pycocotools.coco import COCO
coco = COCO('annotation_file.json')
catIds = coco.getCatIds()
imgIds = coco.getImgIds()
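segmentation is the segmentation polygon. I don't fully understand this key, and my labels only contain bboxes, so I simply set it to [[]]; note that it must be two nested lists. area is the area of the segmentation, bbox is the box's [x, y, w, h] coordinates (x, y is the top-left corner), category_id is the class id and matches an entry in categories, and image_id is the id of the image. id is the id of the bbox and is unique for every box; annotations contains one dictionary per bbox.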
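As a rough sketch of the fields described above, the helper below turns a corner-format box (xmin, ymin, xmax, ymax) into a bbox-only annotation entry. The function name make_annotation and the corner-format input are assumptions for illustration; segmentation is left as [[]], as in this post.

```python
def make_annotation(ann_id, image_id, category_id, xmin, ymin, xmax, ymax):
    """Build a bbox-only COCO annotation from corner coordinates."""
    w = xmax - xmin
    h = ymax - ymin
    return {
        "segmentation": [[]],   # no mask available, only a box
        "area": float(w * h),   # the box area stands in for the mask area
        "iscrowd": 0,
        "image_id": image_id,
        "bbox": [float(xmin), float(ymin), float(w), float(h)],  # [x, y, w, h]
        "category_id": category_id,
        "id": ann_id,
    }


ann = make_annotation(ann_id=1768, image_id=289343, category_id=1,
                      xmin=0, ymin=0, xmax=60, ymax=40)
```

Note that bbox stores the width and height, not the bottom-right corner, so corner-format labels always need this conversion.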
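The data comes from the Alibaba Tianchi cervical cancer risk detection competition. After preprocessing, I have the images and one JSON annotation file per image.
Create the directories following the COCO dataset layout. This step is simple and needs little explanation.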
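For completeness, a small sketch of creating that layout with os.makedirs. The root path is supplied by the caller, and the function name make_coco_dirs is made up for this example.

```python
import os


def make_coco_dirs(root):
    """Create annotations/, train2017/ and val2017/ under root."""
    created = []
    for sub in ("annotations", "train2017", "val2017"):
        path = os.path.join(root, sub)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        created.append(path)
    return created
```

With the paths used in this post, the call would be make_coco_dirs(r'D:\data\TianChi\Train\COCO_ROOT').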
from glob import glob
import os
import random
# Directory that stores the image data
patch_fn_list = glob('D:/data/TianChi/Train/roi_train_total/*.jpg')
# Keep only the image names, without directory or extension
patch_fn_list = [os.path.splitext(os.path.basename(fn))[0] for fn in patch_fn_list]
# Shuffle the images
random.shuffle(patch_fn_list)
# Split into train and val at a 7:3 ratio
train_num = int(0.7 * len(patch_fn_list))
train_patch_list = patch_fn_list[:train_num]
valid_patch_list = patch_fn_list[train_num:]
# produce train/valid/trainval txt file
splits = {'train': train_patch_list,
          'val': valid_patch_list,
          'trainval': patch_fn_list}
for s, names in splits.items():
    # Path of the text file: train.txt, val.txt or trainval.txt
    save_path = 'D:/data/TianChi/Train/' + s + '.txt'
    with open(save_path, 'w') as f:
        for fn in names:
            # Write one image name per line
            f.write('%s\n' % fn)
    print('Finish Producing %s txt file to %s' % (s, save_path))
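Because random.shuffle gives a different split on every run, it can be handy to make the split reproducible. A minimal sketch, assuming a fixed seed is acceptable for your experiments; the function name split_names and the seed value are illustrative.

```python
import random


def split_names(names, train_ratio=0.7, seed=0):
    """Shuffle a copy of names with a fixed seed and split it into train/val."""
    names = sorted(names)        # start from a stable order
    rng = random.Random(seed)    # private generator, global random state untouched
    rng.shuffle(names)
    train_num = int(train_ratio * len(names))
    return names[:train_num], names[train_num:]


all_names = ["img%03d" % i for i in range(10)]
train, val = split_names(all_names)
```

Calling split_names again with the same names and seed returns the same two lists, so train.txt and val.txt can be regenerated at any time.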
import shutil
def my_move(datadir, trainlistdir, vallistdir, traindir, valdir):
    # Open train.txt and read the image names
    with open(trainlistdir, 'r') as fopen:
        file_names = fopen.readlines()
    for file_name in file_names:
        file_name = file_name.strip('\n')
        # Path of the image
        traindata = datadir + file_name + '.jpg'
        # Move the image into traindir
        # To copy instead, change move to copy
        shutil.move(traindata, traindir)
    # Same as above, for the validation images
    with open(vallistdir, 'r') as fopen:
        file_names = fopen.readlines()
    for file_name in file_names:
        file_name = file_name.strip('\n')
        valdata = datadir + file_name + '.jpg'
        shutil.move(valdata, valdir)
# Directory that stores the images
datadir = r'D:\data\TianChi\Train\roi_uniform_hue\\'
# Path of the txt file with the training image names
trainlistdir = r'D:\data\TianChi\Train\ImageSets\Main\train.txt'
# Path of the txt file with the validation image names
vallistdir = r'D:\data\TianChi\Train\ImageSets\Main\val.txt'
# train2017 directory of the COCO-format dataset
traindir = r'D:\data\TianChi\Train\COCO_ROOT\train2017'
# val2017 directory of the COCO-format dataset
valdir = r'D:\data\TianChi\Train\COCO_ROOT\val2017'
my_move(datadir, trainlistdir, vallistdir, traindir, valdir)
import json
import glob
import cv2 as cv
import os
class tococo(object):
    def __init__(self, jpg_paths, label_path, save_path):
        self.images = []
        self.categories = []
        self.annotations = []
        # Paths of all images
        self.jpgpaths = jpg_paths
        self.save_path = save_path
        self.label_path = label_path
        # Set the classes as needed; only one class is used here
        self.class_ids = {'pos': 1}
        self.class_id = 1
        self.coco = {}
    def npz_to_coco(self):
        annid = 0
        for num, jpg_path in enumerate(self.jpgpaths):
            imgname = os.path.splitext(os.path.basename(jpg_path))[0]
            img = cv.imread(jpg_path)
            # Read the JSON label file that belongs to this image
            with open(self.label_path + imgname + '.json') as f:
                labels = json.load(f)
            h, w = img.shape[:2]
            self.images.append(self.get_images(imgname, h, w, num))
            for label in labels:
                px, py, pw, ph = label['x'], label['y'], label['w'], label['h']
                box = [px, py, pw, ph]
                print(box)
                self.annotations.append(self.get_annotations(box, num, annid, label['class']))
                annid = annid + 1
        # One category entry per class, independent of the labels read above
        for name, class_id in self.class_ids.items():
            self.categories.append(self.get_categories(name, class_id))
        self.coco["images"] = self.images
        self.coco["categories"] = self.categories
        self.coco["annotations"] = self.annotations
    def get_images(self, filename, height, width, image_id):
        image = {}
        image["height"] = height
        image['width'] = width
        image["id"] = image_id
        # File name including the extension
        image["file_name"] = filename + '.jpg'
        return image
    def get_categories(self, name, class_id):
        category = {}
        category["supercategory"] = "Positive Cell"
        category['id'] = class_id
        category['name'] = name
        return category
    def get_annotations(self, box, image_id, ann_id, class_name):
        annotation = {}
        w, h = box[2], box[3]
        area = w * h
        annotation['segmentation'] = [[]]
        annotation['iscrowd'] = 0
        # Index of the image, starting from 0
        annotation['image_id'] = image_id
        annotation['bbox'] = box
        annotation['area'] = float(area)
        annotation['category_id'] = self.class_ids[class_name]
        # Index of the annotation, starting from 0
        annotation['id'] = ann_id
        return annotation
    def save_json(self):
        self.npz_to_coco()
        label_dic = self.coco
        # Can be changed to instances_train2017.json
        with open(os.path.join(self.save_path, 'instances_val2017.json'), 'w') as f:
            json.dump(label_dic, f)
# Can be changed to train2017; it must match the file name above
jpg_paths = glob.glob(r'D:\data\TianChi\Train\COCO_ROOT\val2017\*.jpg')
# Directory of the existing label files
label_path = r'D:\data\TianChi\Train\roi_label\\'
# Output directory
save_path = r'D:\data\TianChi\Train\COCO_ROOT\annotations'
c = tococo(jpg_paths, label_path, save_path)
c.save_json()
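This completes the conversion to the COCO data format, and the data can now be used to train models. The program above only applies to the Alibaba Tianchi cervical cancer risk detection dataset and has to be adapted to your own data.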
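Before training, it can save time to sanity-check the generated file. The sketch below checks a COCO dictionary for the constraints described earlier: unique ids, every annotation pointing at an existing image and category, and a non-empty box. The function name check_coco is hypothetical and the checks are not exhaustive.

```python
def check_coco(coco):
    """Return a list of problems found in a COCO-style dictionary."""
    problems = []
    image_ids = [img["id"] for img in coco["images"]]
    ann_ids = [ann["id"] for ann in coco["annotations"]]
    cat_ids = {cat["id"] for cat in coco["categories"]}
    if len(image_ids) != len(set(image_ids)):
        problems.append("duplicate image ids")
    if len(ann_ids) != len(set(ann_ids)):
        problems.append("duplicate annotation ids")
    known_images = set(image_ids)
    for ann in coco["annotations"]:
        if ann["image_id"] not in known_images:
            problems.append("annotation %s: unknown image_id" % ann["id"])
        if ann["category_id"] not in cat_ids:
            problems.append("annotation %s: unknown category_id" % ann["id"])
        if ann["bbox"][2] <= 0 or ann["bbox"][3] <= 0:
            problems.append("annotation %s: empty bbox" % ann["id"])
    return problems


sample = {
    "images": [{"id": 0, "file_name": "a.jpg", "height": 40, "width": 60}],
    "categories": [{"id": 1, "name": "pos", "supercategory": "Positive Cell"}],
    "annotations": [{"id": 0, "image_id": 0, "category_id": 1,
                     "bbox": [0, 0, 60, 40], "area": 2400.0,
                     "iscrowd": 0, "segmentation": [[]]}],
}
problems = check_coco(sample)
```

In practice the dictionary would come from json.load on the generated instances_val2017.json instead of the hand-written sample.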
VisDrone is a drone-based object detection dataset that appears in many object detection papers.
The labels 0 to 11 are 'ignored regions', 'pedestrian', 'people', 'bicycle', 'car', 'van', 'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor' and 'others'.
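Each line of a VisDrone-DET annotation file holds eight comma-separated integers: box left, top, width and height, followed by score, category, truncation and occlusion. A minimal parsing sketch under that assumption; the function name, the returned keys and the sample line are made up for illustration.

```python
CLASS_NAMES = ['ignored regions', 'pedestrian', 'people', 'bicycle', 'car', 'van',
               'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', 'others']


def parse_visdrone_line(line):
    """Parse one annotation line into a dict with a named category."""
    # Empty fields (for example after a trailing comma) are dropped
    fields = [int(v) for v in line.strip().split(',') if v != '']
    left, top, width, height, score, category, truncation, occlusion = fields[:8]
    return {
        'bbox': [left, top, width, height],   # already in COCO [x, y, w, h] order
        'score': score,
        'category': CLASS_NAMES[category],
        'truncation': truncation,
        'occlusion': occlusion,
    }


obj = parse_visdrone_line('10,20,30,40,1,4,0,1\n')
```

Since the box is already stored as left, top, width and height, it matches the COCO bbox layout without any arithmetic.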
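I want to train on this dataset myself with mmdetection, so it first has to be converted to the COCO dataset format.
This is done in two steps: first convert the txt labels to XML files, then convert the XML files to JSON.
The places that need changing are marked with comments; only a few paths have to be adjusted.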
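import os
from PIL import Image
# Change root_dir below to your own path
root_dir = r"D:\object_detection_data\datacovert\VisDrone2019-DET-val/"
annotations_dir = root_dir + "annotations/"
image_dir = root_dir + "images/"
xml_dir = root_dir + "Annotations_XML/"  # the XML files are saved in the Annotations_XML folder
os.makedirs(xml_dir, exist_ok=True)
# Replace the classes below with your own to convert other datasets
class_name = ['ignored regions', 'pedestrian', 'people', 'bicycle', 'car', 'van',
              'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor', 'others']
# The tags follow the Pascal VOC layout; the xml2json step reads
# filename, size and object/name/bndbox
for filename in os.listdir(annotations_dir):
    image_name = filename.split('.')[0]
    img = Image.open(image_dir + image_name + ".jpg")  # for png images, change this to ".png"
    xml_name = xml_dir + image_name + '.xml'
    with open(annotations_dir + filename, 'r') as fin, open(xml_name, 'w') as fout:
        fout.write('<annotation>' + '\n')
        fout.write('\t' + '<folder>VOC2007</folder>' + '\n')
        fout.write('\t' + '<filename>' + image_name + '.jpg' + '</filename>' + '\n')
        fout.write('\t' + '<owner>' + '\n')
        fout.write('\t\t' + '<flickrid>' + 'LJ' + '</flickrid>' + '\n')
        fout.write('\t\t' + '<name>' + 'LJ' + '</name>' + '\n')
        fout.write('\t' + '</owner>' + '\n')
        fout.write('\t' + '<size>' + '\n')
        fout.write('\t\t' + '<width>' + str(img.size[0]) + '</width>' + '\n')
        fout.write('\t\t' + '<height>' + str(img.size[1]) + '</height>' + '\n')
        fout.write('\t\t' + '<depth>' + '3' + '</depth>' + '\n')
        fout.write('\t' + '</size>' + '\n')
        fout.write('\t' + '<segmented>' + '0' + '</segmented>' + '\n')
        for line in fin:
            if not line.strip():
                continue
            # VisDrone fields: left, top, width, height, score, category, truncation, occlusion
            line = line.strip().split(',')
            fout.write('\t' + '<object>' + '\n')
            fout.write('\t\t' + '<name>' + class_name[int(line[5])] + '</name>' + '\n')
            fout.write('\t\t' + '<bndbox>' + '\n')
            fout.write('\t\t\t' + '<xmin>' + line[0] + '</xmin>' + '\n')
            fout.write('\t\t\t' + '<ymin>' + line[1] + '</ymin>' + '\n')
            fout.write('\t\t\t' + '<xmax>' + str(int(line[0]) + int(line[2]) - 1) + '</xmax>' + '\n')
            fout.write('\t\t\t' + '<ymax>' + str(int(line[1]) + int(line[3]) - 1) + '</ymax>' + '\n')
            fout.write('\t\t' + '</bndbox>' + '\n')
            fout.write('\t' + '</object>' + '\n')
        fout.write('</annotation>')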
#!/usr/bin/python
# The XML files are in VOC format
# The JSON file is in COCO format
import sys, os, json, glob
import xml.etree.ElementTree as ET
INITIAL_BBOXIds = 1
# PREDEF_CLASSE = {}
PREDEF_CLASSE = {'pedestrian': 1, 'people': 2,
                 'bicycle': 3, 'car': 4, 'van': 5, 'truck': 6, 'tricycle': 7,
                 'awning-tricycle': 8, 'bus': 9, 'motor': 10}
# I only want to detect these ten classes; 0 and 11 are not predefined here.
# function
def get(root, name):
    return root.findall(name)
def get_and_check(root, name, length):
    found = root.findall(name)
    if len(found) == 0:
        raise NotImplementedError('Can not find %s in %s.' % (name, root.tag))
    if length > 0 and len(found) != length:
        raise NotImplementedError('The size of %s is supposed to be %d, but is %d.' % (name, length, len(found)))
    if length == 1:
        found = found[0]
    return found
def convert(xml_paths, out_json):
    json_dict = {'images': [], 'type': 'instances',
                 'categories': [], 'annotations': []}
    categories = PREDEF_CLASSE
    bbox_id = INITIAL_BBOXIds
    for image_id, xml_f in enumerate(xml_paths):
        # Progress output
        sys.stdout.write('\r>> Converting image %d/%d' % (
            image_id + 1, len(xml_paths)))
        sys.stdout.flush()
        tree = ET.parse(xml_f)
        root = tree.getroot()
        filename = get_and_check(root, 'filename', 1).text
        size = get_and_check(root, 'size', 1)
        width = int(get_and_check(size, 'width', 1).text)
        height = int(get_and_check(size, 'height', 1).text)
        image = {'file_name': filename, 'height': height,
                 'width': width, 'id': image_id + 1}
        json_dict['images'].append(image)
        ## Currently we do not support segmentation
        # segmented = get_and_check(root, 'segmented', 1).text
        # assert segmented == '0'
        for obj in get(root, 'object'):
            category = get_and_check(obj, 'name', 1).text
            # A class missing from PREDEF_CLASSE is appended with the next free id
            if category not in categories:
                new_id = max(categories.values()) + 1
                categories[category] = new_id
            category_id = categories[category]
            bbox = get_and_check(obj, 'bndbox', 1)
            xmin = int(get_and_check(bbox, 'xmin', 1).text) - 1
            ymin = int(get_and_check(bbox, 'ymin', 1).text) - 1
            xmax = int(get_and_check(bbox, 'xmax', 1).text)
            ymax = int(get_and_check(bbox, 'ymax', 1).text)
            # Skip degenerate boxes
            if xmax <= xmin or ymax <= ymin:
                continue
            o_width = abs(xmax - xmin)
            o_height = abs(ymax - ymin)
            ann = {'area': o_width * o_height, 'iscrowd': 0, 'image_id': image_id + 1,
                   'bbox': [xmin, ymin, o_width, o_height], 'category_id': category_id,
                   'id': bbox_id, 'ignore': 0, 'segmentation': []}
            json_dict['annotations'].append(ann)
            bbox_id = bbox_id + 1
    for cate, cid in categories.items():
        cat = {'supercategory': 'none', 'id': cid, 'name': cate}
        json_dict['categories'].append(cat)
    # indent=4 gives a more readable file but is slower; drop it for faster writing
    with open(out_json, 'w') as json_file:
        json.dump(json_dict, json_file, indent=4)
if __name__ == '__main__':
    # Change this to the location of your XML files
    xml_path = r'D:\object_detection_data\datacovert\VisDrone2019-DET-val/Annotations_XML/'
    xml_file = glob.glob(os.path.join(xml_path, '*.xml'))
    # Change this to where the generated JSON should be saved
    convert(xml_file, r'D:\object_detection_data\datacovert\VisDrone2019-DET-val/NEW_val.json')
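To see how the converted annotations are distributed over the classes, the per-class counts can be read straight from the JSON dictionary. A small sketch, assuming the dictionary has the categories and annotations fields described above; count_per_category is an illustrative name and the sample data is invented.

```python
from collections import Counter


def count_per_category(coco):
    """Map each category name to its number of annotations."""
    id_to_name = {cat['id']: cat['name'] for cat in coco['categories']}
    counts = Counter(ann['category_id'] for ann in coco['annotations'])
    return {name: counts.get(cid, 0) for cid, name in id_to_name.items()}


sample = {
    'categories': [{'id': 1, 'name': 'pedestrian'}, {'id': 4, 'name': 'car'}],
    'annotations': [{'category_id': 4}, {'category_id': 4}, {'category_id': 1}],
}
per_class = count_per_category(sample)
```

For the real file, load the dictionary with json.load first. Classes with a count of zero, or unexpected extra class names, are a quick hint that the class mapping needs another look.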
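Here I use the configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py model.
First download the corresponding weights, then change the number of neurons in the final fully connected layers of the checkpoint.
A general script for two-stage detectors follows; change the names of the loaded and saved weights and run it.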
import torch
pretrained_weights = torch.load('checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth')
num_class = 10
pretrained_weights['state_dict']['roi_head.bbox_head.fc_cls.weight'].resize_(num_class+1, 1024)
pretrained_weights['state_dict']['roi_head.bbox_head.fc_cls.bias'].resize_(num_class+1)
pretrained_weights['state_dict']['roi_head.bbox_head.fc_reg.weight'].resize_(num_class*4, 1024)
pretrained_weights['state_dict']['roi_head.bbox_head.fc_reg.bias'].resize_(num_class*4)
torch.save(pretrained_weights, "faster_rcnn_r50_fpn_1x_%d.pth"%num_class)
Afterwards, simply load these modified weights.
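The sizes passed to resize_ follow from the class count: the classification layer has one output per class plus one for the background, and the regression layer has four box values per class. A small sketch of that arithmetic in plain Python; the function name head_shapes is illustrative, and 1024 is the feature width used in the script above.

```python
def head_shapes(num_class, fc_dim=1024):
    """Expected tensor shapes of the bbox head for a given class count."""
    return {
        'fc_cls.weight': (num_class + 1, fc_dim),  # classes + background
        'fc_cls.bias': (num_class + 1,),
        'fc_reg.weight': (num_class * 4, fc_dim),  # 4 box values per class
        'fc_reg.bias': (num_class * 4,),
    }


shapes = head_shapes(10)
```

For the ten VisDrone classes this gives 11 classification outputs and 40 regression outputs, compared with 81 and 320 for the 80 COCO classes.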
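Here I only detect ten classes; the classes for labels 0 and 11 are not detected.
Next, three places related to the classes need to be modified.
One of them is in mmdet/datasets/coco.py, where the classes are changed to the VisDrone classes to be detected.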
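As an alternative to editing the library source, MMDetection 2.x configs can usually carry the class list themselves. The fragment below is a sketch under that assumption: the config file name and the annotation and image paths are placeholders, and the keys should be checked against the installed MMDetection version.

```python
# Hypothetical config file, e.g. configs/faster_rcnn/faster_rcnn_r50_fpn_1x_visdrone.py
_base_ = './faster_rcnn_r50_fpn_1x_coco.py'

# The ten classes to detect; labels 0 and 11 are left out
classes = ('pedestrian', 'people', 'bicycle', 'car', 'van',
           'truck', 'tricycle', 'awning-tricycle', 'bus', 'motor')

# num_classes counts foreground classes only
model = dict(roi_head=dict(bbox_head=dict(num_classes=len(classes))))

data = dict(
    train=dict(classes=classes,
               ann_file='path/to/NEW_train.json',      # placeholder path
               img_prefix='path/to/train/images/'),    # placeholder path
    val=dict(classes=classes,
             ann_file='path/to/NEW_val.json',          # placeholder path
             img_prefix='path/to/val/images/'),        # placeholder path
    test=dict(classes=classes,
              ann_file='path/to/NEW_val.json',         # placeholder path
              img_prefix='path/to/val/images/'),       # placeholder path
)

# Checkpoint written by the weight-resizing script
load_from = 'faster_rcnn_r50_fpn_1x_10.pth'
```

Keeping the class list in the config leaves the installed package untouched, which makes it easier to switch back to the original COCO classes later.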
After changing the classes, run the command below to check that the labels line up with the boxes; if they do, training can start.
python tools/misc/browse_dataset.py configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py
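The scenes in this dataset are fairly complex, and the mAP on small objects is very low. People are also split into pedestrian and people; I find the two too similar and easy to confuse, so the mAP of both is very low. The class split feels a bit too fine.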
The converted JSON files are available on Baidu Cloud.
Link: https://pan.baidu.com/s/1BnpYSsViBnuT7FJq-nzxWw
Extraction code: 1111
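Object detection: parsing the VOC and COCO formats and building your own dataset, X.YU (xyu.ink)
Converting the VisDrone dataset to COCO format and training it on mmdetection, with the converted JSON files attached, CSDN blog
VisDrone2019 (to yolo / voc / coco), MMDetection data part, CSDN blog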