Detectron2: Building Your Own Detection Dataset in COCO Format

0. Preface

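  • I had never worked on detection tasks systematically, but I recently needed to, so I started studying the big open-source projects more thoroughly: detectron, detectron2, maskrcnn, mmdetection, deformable, and so on. Detectron2 has just been released, so I am starting with it.
  • Detectron2 still trains through the COCO dataset interface. What I personally cannot understand is this: it aims to be a project that others build on, the way Caffe was, yet its internal mechanisms are wrapped up in an overly foolproof way and the documentation is sparse. As a result, a newcomer like me finds it hard even to debug a training run. Still, I am writing down my notes on this project in the hope that they help others get started more easily.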

References:

1. Use Custom Datasets [facebookresearch/detectron2/docs/tutorials/datasets.md]

2. Use Custom Dataloaders [facebookresearch/detectron2/docs/tutorials/data_loading.md]

3. Detectron: Training on Your Own Dataset, Converting Your Data Format to COCO Format [CSDN]

4. The COCO Dataset Annotation Format [Zhihu]

5. How to train Detectron2 with Custom COCO Datasets [Dlology]

6. Create COCO Annotations From Scratch [ImMersiveLimit]

7. COCO Data format [COCO Official]

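1. COCO dataset files and the detection dataset format

1) The file is in JSON format, and its top level is one large dict.

2) The dict contains the following key-value pairs: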

{
    "info":{},          # optional, dataset information
    "licenses":[],      # optional, license information
    "categories":[],    # category information
    "images":[],        # image information
    "annotations":[]    # bounding-box annotations
}
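As a minimal sketch of the layout above, the top-level dict can be built in Python and serialized with the standard `json` module. The helper name `make_coco_skeleton` and the description string are illustrative assumptions, not part of COCO or Detectron2.

```python
import json


def make_coco_skeleton(description=""):
    """Return an empty COCO-style detection dataset dict (hypothetical helper)."""
    return {
        "info": {"description": description},  # optional
        "licenses": [],                        # optional
        "categories": [],                      # filled with category dicts
        "images": [],                          # filled with image dicts
        "annotations": [],                     # filled with annotation dicts
    }


coco = make_coco_skeleton("my detection dataset")

# json.dumps produces the text that would be written to the annotation file.
text = json.dumps(coco, indent=2)
```

Writing `text` to a file such as `train.json` (a placeholder name) gives the annotation file that the rest of this section fills in.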

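3) Each key-value pair (field) stores a different kind of data: "categories" holds the category information, "images" the image information, and "annotations" the bounding-box annotations. The structure of each field is as follows:

"categories":[
    {
    "id": int,           # category id, unique within the categories field
    "name": str,         # category name, a string
    "supercategory": str # name of the parent category one level up
    }
]
"images":[
    {
    "file_name": str,    # image file name, or a path relative to the image root directory
    "id": int,           # image id, unique within the images field
    "width": int,        # image width
    "height": int,       # image height
    # the remaining fields are optional and not listed here
    }
]
"annotations":[
    {
    "area": double,      # area of the annotated region, int or double (COCO stores the mask area; w*h is a common choice for box-only data)
    "bbox": [double, double, double, double], # bounding box, [x0,y0,w,h]
    "category_id": int,  # id of the category this box belongs to
    "image_id": int,     # id of the image this box belongs to
    "id": int,           # annotation id, unique within the annotations field
    "iscrowd": 0/1,      # 0: polygon format, one annotation marks a single object; 1: RLE format, one annotation marks a group of objects; usually 0
    "segmentation": [[double, double, double, double, double, double, double, double]]  # format used when iscrowd is 0: a list of polygons, each a flat list of x, y coordinates (four corners shown here)
    }
]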
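For a box-only dataset, one annotation entry can be derived entirely from the box itself. The sketch below assumes that `area` is taken as `w * h` and that the segmentation polygon is simply the four box corners; `make_box_annotation` is a hypothetical helper, not a COCO or Detectron2 function.

```python
def make_box_annotation(ann_id, image_id, category_id, x0, y0, w, h):
    """Build one COCO-style annotation dict from a box given as [x0, y0, w, h]."""
    x1, y1 = x0 + w, y0 + h
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": [x0, y0, w, h],
        "area": w * h,   # assumption: box area stands in for the mask area
        "iscrowd": 0,    # a single object, polygon-style segmentation
        # Rectangle corners in the order top-left, top-right,
        # bottom-right, bottom-left.
        "segmentation": [[x0, y0, x1, y0, x1, y1, x0, y1]],
    }


ann = make_box_annotation(ann_id=1, image_id=1, category_id=1,
                          x0=10.0, y0=20.0, w=30.0, h=40.0)
```

If real instance masks are available, the polygon and its true area should replace the rectangle used here.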

4) An example of each field:

"info": {
    "description": "COCO 2017 Dataset",
    "url": "http://cocodataset.org",
    "version": "1.0",
    "year": 2017,
    "contributor": "COCO Consortium",
    "date_created": "2017/09/01"
}
"licenses": [
    {
        "url": "http://creativecommons.org/licenses/by-nc-sa/2.0/",
        "id": 1,
        "name": "Attribution-NonCommercial-ShareAlike License"
    },
    {
        "url": "http://creativecommons.org/licenses/by-nc/2.0/",
        "id": 2,
        "name": "Attribution-NonCommercial License"
    },
    ...
]
"categories": [
    {"supercategory": "person","id": 1,"name": "person"},
    {"supercategory": "vehicle","id": 2,"name": "bicycle"},
    {"supercategory": "vehicle","id": 3,"name": "car"},
    {"supercategory": "vehicle","id": 4,"name": "motorcycle"},
    {"supercategory": "vehicle","id": 5,"name": "airplane"},
    ...
    {"supercategory": "indoor","id": 89,"name": "hair drier"},
    {"supercategory": "indoor","id": 90,"name": "toothbrush"}
]
"images": [
    {
        "file_name": "000000397133.jpg",
        "id": 397133,
        "height": 427,
        "width": 640,
        "license": 4,
        "coco_url": "http://images.cocodataset.org/val2017/000000397133.jpg",
        "date_captured": "2013-11-14 17:02:52",
        "flickr_url": "http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg"
    },
    {
        "file_name": "000000037777.jpg",
        "id": 37777,
        "height": 230,
        "width": 352,
        "license": 1,
        "coco_url": "http://images.cocodataset.org/val2017/000000037777.jpg",
        "date_captured": "2013-11-14 20:55:31",
        "flickr_url": "http://farm9.staticflickr.com/8429/7839199426_f6d48aa585_z.jpg"
    },
    ...
]
"annotations": [
    {
        # segmentation is a list of polygons, each a flat list [x, y, x, y, ...]; for a plain box the corners would be [[x0, y0, x1, y0, x1, y1, x0, y1]]
        "segmentation": [[510.66,423.01,511.72,420.03,...,510.45,423.01]],
        "area": 702.1057499999998,
        "iscrowd": 0,
        "image_id": 289343,
        "bbox": [473.07,395.93,38.65,28.67],
        "category_id": 18,
        "id": 1768
    },
    ...
    {
        "segmentation": {
            "counts": [179,27,392,41,…,55,20],
            "size": [426,640]
        },
        "area": 220834,
        "iscrowd": 1,
        "image_id": 250282,
        "bbox": [0,34,639,388],
        "category_id": 1,
        "id": 900100250282
    }
]
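Because the three fields refer to each other only through ids, a quick consistency check before training can save debugging time. The following is a minimal sketch with a hypothetical `check_coco` helper; it only verifies that ids are unique, that annotations point at existing images and categories, and that boxes have positive size.

```python
def check_coco(coco):
    """Return a list of human-readable problems found in a COCO-style dict."""
    problems = []
    image_ids = [img["id"] for img in coco["images"]]
    category_ids = [cat["id"] for cat in coco["categories"]]
    ann_ids = [ann["id"] for ann in coco["annotations"]]

    # Every id must be unique within its own field.
    for name, ids in (("images", image_ids),
                      ("categories", category_ids),
                      ("annotations", ann_ids)):
        if len(ids) != len(set(ids)):
            problems.append(f"duplicate ids in {name}")

    known_images, known_categories = set(image_ids), set(category_ids)
    for ann in coco["annotations"]:
        if ann["image_id"] not in known_images:
            problems.append(f"annotation {ann['id']}: unknown image_id {ann['image_id']}")
        if ann["category_id"] not in known_categories:
            problems.append(f"annotation {ann['id']}: unknown category_id {ann['category_id']}")
        box = ann["bbox"]
        if len(box) != 4 or box[2] <= 0 or box[3] <= 0:
            problems.append(f"annotation {ann['id']}: invalid bbox {box}")
    return problems


# A tiny made-up dataset used only to exercise the check.
sample = {
    "categories": [{"id": 1, "name": "person", "supercategory": "person"}],
    "images": [{"id": 1, "file_name": "a.jpg", "width": 640, "height": 480}],
    "annotations": [{
        "id": 1, "image_id": 1, "category_id": 1,
        "bbox": [10, 20, 30, 40], "area": 1200, "iscrowd": 0,
        "segmentation": [[10, 20, 40, 20, 40, 60, 10, 60]],
    }],
}
problems = check_coco(sample)
```

An empty `problems` list means the cross-references line up; it does not guarantee that the boxes are correct for the images.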

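2. Example code for generating the dataset (using LabelImg)

The complete code is on my GitHub, in a repo forked from detectron2:
https://github.com/unanan/detectron2/blob/master/datasets/read_json.py

A condensed code snippet is still to be added to this post.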
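In the meantime, here is a minimal sketch of such a conversion. It is not the `read_json.py` script linked above. It assumes LabelImg was used in its Pascal VOC mode, which writes one XML file per image with `filename`, `size`, and `object/bndbox` elements; the function name `voc_to_coco`, the class list, and the sample XML are all made up for illustration.

```python
import xml.etree.ElementTree as ET


def voc_to_coco(xml_strings, class_names):
    """Convert Pascal VOC XML documents (as strings) into one COCO-style dict."""
    cat_ids = {name: i + 1 for i, name in enumerate(class_names)}
    coco = {
        # supercategory is set to the class name as a placeholder.
        "categories": [{"id": cid, "name": name, "supercategory": name}
                       for name, cid in cat_ids.items()],
        "images": [],
        "annotations": [],
    }
    for image_id, xml_text in enumerate(xml_strings, start=1):
        root = ET.fromstring(xml_text.strip())
        size = root.find("size")
        coco["images"].append({
            "id": image_id,
            "file_name": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
        })
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            x0, y0, x1, y1 = (float(box.findtext(tag))
                              for tag in ("xmin", "ymin", "xmax", "ymax"))
            w, h = x1 - x0, y1 - y0  # VOC corners -> COCO [x0, y0, w, h]
            coco["annotations"].append({
                "id": len(coco["annotations"]) + 1,
                "image_id": image_id,
                # Raises KeyError if the label is missing from class_names.
                "category_id": cat_ids[obj.findtext("name")],
                "bbox": [x0, y0, w, h],
                "area": w * h,  # assumption: box area, no mask available
                "iscrowd": 0,
                "segmentation": [[x0, y0, x1, y0, x1, y1, x0, y1]],
            })
    return coco


SAMPLE_XML = """
<annotation>
  <filename>img_0001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>
"""

coco = voc_to_coco([SAMPLE_XML], class_names=["cat", "dog"])
```

In practice the XML strings would be read from the files LabelImg saved, and the resulting dict written out with `json.dump`. That JSON file can then be registered through `register_coco_instances` from `detectron2.data.datasets`, which takes a dataset name, a metadata dict, the JSON path, and the image root directory.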
