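detectron, detectron2, maskrcnn, mmdetection, deformable, and other large open-source projects of this kind. detectron2 had just been released, so I decided to study it systematically first. Detectron2 still trains through the COCO dataset interface. What I personally cannot understand is this: it aims to be a project that people build on, the way Caffe was, yet its internal mechanisms are wrapped up in an overly foolproof way and the documentation is thin. As a result, someone like me has trouble even debugging a training run. Still, I am writing down my study notes on this project in the hope that they help others get started more easily.
References: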
1. Use Custom Datasets [facebookresearch/detectron2/docs/tutorials/datasets.md]
2. Use Custom Dataloaders [facebookresearch/detectron2/docs/tutorials/data_loading.md]
3. Detectron: Training on Your Own Dataset, Converting Your Data Format to COCO Format [CSDN]
4. The Annotation Format of the COCO Dataset [Zhihu]
5. How to train Detectron2 with Custom COCO Datasets [Dlology]
6. Create COCO Annotations From Scratch [ImMersiveLimit]
7. COCO Data format [COCO Official]
{
"info":{}, # optional, dataset information
"licenses":[], # optional, license information
"categories":[], # category information
"images":[], # image information
"annotations":[] # bounding-box annotation information
}
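As a minimal sketch, this top-level skeleton could be built and serialized from Python with only the standard library. The helper name `empty_coco` is hypothetical and chosen purely for illustration.

```python
import json


def empty_coco():
    """Return a dict with the five top-level COCO keys, all empty."""
    return {
        "info": {},        # optional
        "licenses": [],    # optional
        "categories": [],
        "images": [],
        "annotations": [],
    }


coco = empty_coco()
coco["categories"].append({"id": 1, "name": "person", "supercategory": "person"})

# Serialize to a JSON string; writing to a file would use json.dump instead.
text = json.dumps(coco)
restored = json.loads(text)
```

Note that the `#` comments in the listing above are there for the reader only. A real JSON file cannot contain comments, so they must not be written into the annotation file.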
"categories"
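holds the category information, "images" holds the image information, and "annotations" holds the bounding-box annotations. The structure inside each field is as follows:
"categories":[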
{
"id": int, # category id, unique within the categories field
"name": str, # category name, a string
"supercategory": str # name of the parent category this category belongs to
}
]
"images":[
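{
"file_name": str, # image file name (or path)
"id": int, # image id, unique within the images field
"width": int, # image width in pixels
"height": int, # image height in pixels
# the remaining fields are optional and not listed here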
}
]
"annotations":[
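{
"area": double, # area of the annotated region, int or double (for COCO's own annotations this is the segmentation area, which can be smaller than w*h of the box)
"bbox": [double, double, double, double], # bounding box, [x0,y0,w,h]
"category_id": int, # id of the category it belongs to
"image_id": int, # id of the image it belongs to
"id": int, # annotation id, unique within the annotations field
"iscrowd": 0/1, # 0: polygon format, one annotation marks a single object; 1: RLE format, one annotation marks a group of objects; usually 0
"segmentation": [[double, double, double, double, double, double, double, double]] # when iscrowd is 0, the field has this format (a list of polygons)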
}
]
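The annotation fields above can be filled in from a plain corner-format box. The following is a sketch under stated assumptions: the function name `box_to_annotation` is hypothetical, the input is assumed to be (x0, y0) top-left and (x1, y1) bottom-right, and since no mask is available the box area is used for "area" and the rectangle itself for "segmentation".

```python
def box_to_annotation(ann_id, image_id, category_id, x0, y0, x1, y1):
    """Build a COCO-style annotation dict from a corner-format box.

    Assumes (x0, y0) is the top-left and (x1, y1) the bottom-right corner.
    """
    w = x1 - x0
    h = y1 - y0
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x0, y0, w, h],   # COCO stores [x0, y0, w, h]
        "area": w * h,            # box area, used here because there is no mask
        "iscrowd": 0,
        # Rectangle written as a 4-vertex polygon.
        "segmentation": [[x0, y0, x1, y0, x1, y1, x0, y1]],
    }


ann = box_to_annotation(ann_id=1, image_id=7, category_id=3,
                        x0=10, y0=20, x1=50, y1=80)
```

Here `ann["bbox"]` comes out as `[10, 20, 40, 60]`, which shows the conversion from corner coordinates to the width/height form.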
"info": {
"description": "COCO 2017 Dataset",
"url": "http://cocodataset.org",
"version": "1.0",
"year": 2017,
"contributor": "COCO Consortium",
"date_created": "2017/09/01"
}
"licenses": [
{
"url": "http://creativecommons.org/licenses/by-nc-sa/2.0/",
"id": 1,
"name": "Attribution-NonCommercial-ShareAlike License"
},
{
"url": "http://creativecommons.org/licenses/by-nc/2.0/",
"id": 2,
"name": "Attribution-NonCommercial License"
},
...
]
"categories": [
{"supercategory": "person","id": 1,"name": "person"},
{"supercategory": "vehicle","id": 2,"name": "bicycle"},
{"supercategory": "vehicle","id": 3,"name": "car"},
{"supercategory": "vehicle","id": 4,"name": "motorcycle"},
{"supercategory": "vehicle","id": 5,"name": "airplane"},
...
{"supercategory": "indoor","id": 89,"name": "hair drier"},
{"supercategory": "indoor","id": 90,"name": "toothbrush"}
]
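The category ids in this excerpt run up to 90 and need not be contiguous, whereas a classification head usually expects labels 0..N-1. A minimal sketch of such a remapping follows; the function name `contiguous_id_map` is hypothetical and the three categories are a hand-picked subset of the list above.

```python
def contiguous_id_map(categories):
    """Map each category id to an index in 0..N-1, ordered by id."""
    ids = sorted(c["id"] for c in categories)
    return {cat_id: idx for idx, cat_id in enumerate(ids)}


# A small subset of the categories listed above.
categories = [
    {"supercategory": "person", "id": 1, "name": "person"},
    {"supercategory": "vehicle", "id": 3, "name": "car"},
    {"supercategory": "indoor", "id": 90, "name": "toothbrush"},
]
id_map = contiguous_id_map(categories)
```

With this subset, `id_map` is `{1: 0, 3: 1, 90: 2}`. The inverse mapping is needed when writing predictions back in COCO format.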
"images": [
{
"file_name": "000000397133.jpg",
"id": 397133,
"height": 427,
"width": 640,
"license": 4,
"coco_url": "http://images.cocodataset.org/val2017/000000397133.jpg",
"date_captured": "2013-11-14 17:02:52",
"flickr_url": "http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg"
},
{
"file_name": "000000037777.jpg",
"id": 37777,
"height": 230,
"width": 352,
"license": 1,
"coco_url": "http://images.cocodataset.org/val2017/000000037777.jpg",
"date_captured": "2013-11-14 20:55:31",
"flickr_url": "http://farm9.staticflickr.com/8429/7839199426_f6d48aa585_z.jpg"
},
...
]
"annotations": [
{
# segmentation is a flat list of polygon vertices [x, y, x, y, ...]; for a rectangle the order would be [[x0, y0, x1, y0, x1, y1, x0, y1]]
"segmentation": [[510.66,423.01,511.72,420.03,...,510.45,423.01]],
"area": 702.1057499999998,
"iscrowd": 0,
"image_id": 289343,
"bbox": [473.07,395.93,38.65,28.67],
"category_id": 18,
"id": 1768
},
...
{
"segmentation": {
"counts": [179,27,392,41,…,55,20],
"size": [426,640]
},
"area": 220834,
"iscrowd": 1,
"image_id": 250282,
"bbox": [0,34,639,388],
"category_id": 1,
"id": 900100250282
}
]
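The last annotation above (iscrowd = 1) stores its mask as run lengths rather than a polygon. The sketch below decodes such an uncompressed RLE in pure Python, assuming the usual COCO convention as I understand it: "size" is [height, width], runs alternate between 0 and 1 starting with 0, and pixels are laid out column by column. The function name `rle_to_mask` and the toy data are hypothetical; the truncated "counts" in the example above cannot be decoded as printed.

```python
def rle_to_mask(rle):
    """Decode uncompressed COCO-style RLE into a list of rows of 0/1 values."""
    height, width = rle["size"]
    flat = []
    value = 0  # runs alternate 0, 1, 0, 1, ... starting with 0
    for run in rle["counts"]:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not sum to height * width")
    # flat is column-major: pixel (row r, column c) sits at index c * height + r
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]


# Toy 2x3 mask (made-up data, not taken from COCO).
toy = {"counts": [1, 3, 2], "size": [2, 3]}
mask = rle_to_mask(toy)
area = sum(map(sum, mask))  # number of foreground pixels
```

For the toy input the mask is `[[0, 1, 0], [1, 1, 0]]` and `area` is 3, which matches the sum of the runs of ones.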
Placeholder in the blog: concise code example snippets to be added here.