Use Custom Datasets

If you want to re-use detectron2's data loader with a custom dataset, you need to:

  • Register your dataset, i.e. tell detectron2 how to obtain your dataset;
  • Optionally, register metadata for your dataset.

Both steps are explained in detail below.

The Colab Notebook contains a live example of how to register and train on a dataset in the standard format.

Register a Dataset

To let detectron2 know how to obtain a dataset named "my_dataset", you need to implement a function that returns the items in this dataset, for example:

def get_dicts():
  ...  # load or construct your data here
  return dataset_dicts  # a list[dict] in the format described below

Then, tell detectron2 about this function:

from detectron2.data import DatasetCatalog
DatasetCatalog.register("my_dataset", get_dicts)

The two lines above associate the dataset named "my_dataset" with the function get_dicts that returns its data. If you do not modify the downstream code that loads and maps the data, get_dicts must return the data in detectron2's standard dataset format. What does that format look like? It is described below.
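
As a quick sanity check (a minimal sketch, assuming the registration above has already run), the registered function can be invoked by name through the catalog:

from detectron2.data import DatasetCatalog
# DatasetCatalog.get() calls the registered get_dicts() and returns its result.
dataset_dicts = DatasetCatalog.get("my_dataset")
print(len(dataset_dicts), "records in my_dataset")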

For standard tasks such as instance detection, instance/semantic/panoptic segmentation, and keypoint detection, we use a format similar to COCO's annotations as the standard dataset format.

Each image is annotated as one dict with a number of optional fields, and some fields can be inferred from others; for example, when the image field is unavailable, the data loader reads the image from file_name. The fields are listed below (a minimal example of this format follows the list):

  • file_name: the full path to the image file.

  • sem_seg_file_name: the full path to the ground truth semantic segmentation file.

  • image: the image as a numpy array.

  • sem_seg: semantic segmentation ground truth in a 2D numpy array. Values in the array represent category labels.

  • height, width: integer. The shape of the image.

  • image_id (str): a string to identify this image. Mainly used during evaluation to identify the image. Each dataset may use it for different purposes.

  • annotations (list[dict]): the per-instance annotations of every instance in this image. Each annotation dict may contain:

    • bbox (list[float]): list of 4 numbers representing the bounding box of the instance.

    • bbox_mode (int): the format of bbox. It must be a member of structures.BoxMode. Currently supports: BoxMode.XYXY_ABS, BoxMode.XYWH_ABS.

    • category_id (int): an integer in the range [0, num_categories) representing the category label. The value num_categories is reserved to represent the “background” category, if applicable.

    • segmentation (list[list[float]] or dict):

      • If list[list[float]], it represents a list of polygons, one for each connected component of the object. Each list[float] is one simple polygon in the format of [x1, y1, ..., xn, yn]. The Xs and Ys are either relative coordinates in [0, 1], or absolute coordinates, depending on whether “bbox_mode” is relative.
      • If dict, it represents the per-pixel segmentation mask in COCO’s RLE format.
    • keypoints (list[float]): in the format of [x1, y1, v1, ..., xn, yn, vn]. v[i] means the visibility of this keypoint. n must be equal to the number of keypoint categories. The Xs and Ys are either relative coordinates in [0, 1], or absolute coordinates, depending on whether “bbox_mode” is relative.

      Note: coordinate annotations in COCO are integers in the range [0, H-1 or W-1]. By default, detectron2 adds 0.5 to absolute keypoint coordinates to convert them from discrete pixel indices to floating point coordinates.

    • iscrowd: 0 or 1. Whether this instance is labeled as COCO’s “crowd region”.

  • proposal_boxes (array): 2D numpy array with shape (K, 4) representing K precomputed proposal boxes for this image.

  • proposal_objectness_logits (array): numpy array with shape (K, ), which corresponds to the objectness logits of proposals in ‘proposal_boxes’.

  • proposal_bbox_mode (int): the format of the precomputed proposal bbox. It must be a member of structures.BoxMode. Default format is BoxMode.XYXY_ABS.
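
To make the format concrete, here is a minimal sketch of such a function returning a single, entirely hypothetical image record with one box and one polygon annotation (paths, sizes and coordinates are made up for illustration):

import os
from detectron2.structures import BoxMode

def get_dicts():
    # Hypothetical dataset: a single 640x480 image with one annotated instance.
    record = {
        "file_name": os.path.join("path/to/image/dir", "000001.jpg"),
        "image_id": "000001",
        "height": 480,
        "width": 640,
        "annotations": [
            {
                "bbox": [100.0, 120.0, 300.0, 400.0],  # [x0, y0, x1, y1]
                "bbox_mode": BoxMode.XYXY_ABS,         # absolute XYXY coordinates
                "category_id": 0,                      # index into thing_classes
                # one simple polygon [x1, y1, ..., xn, yn] per connected component
                "segmentation": [[100.0, 120.0, 300.0, 120.0,
                                  300.0, 400.0, 100.0, 400.0]],
                "iscrowd": 0,
            }
        ],
    }
    return [record]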

If your dataset is already in the COCO format, it can be registered with a single call:

from detectron2.data.datasets import register_coco_instances
register_coco_instances("my_dataset", {}, "json_annotation.json", "path/to/image/dir")

detectron2 will then take care of all the details, including the metadata.

“Metadata” for Datasets

Each dataset is associated with metadata, accessible through MetadataCatalog.get(dataset_name).some_metadata. Metadata contains primitive information about the dataset, such as names of classes, colors of classes, root of files, etc., which is useful for augmentation, evaluation, visualization, logging, and so on. The structure of the metadata depends on what the downstream code needs from it.
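
For example, the metadata of a built-in dataset can be read attribute-style (a minimal sketch; it assumes the built-in dataset name "coco_2017_train" is registered in your detectron2 installation):

from detectron2.data import MetadataCatalog
# thing_classes is the list of instance category names for this dataset.
metadata = MetadataCatalog.get("coco_2017_train")
print(metadata.thing_classes[:5])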

If you register a new dataset via DatasetCatalog.register, you may also want to add its corresponding metadata via MetadataCatalog.get(dataset_name).some_key = some_value, so that the features that need it can use the metadata later. Taking the metadata key thing_classes as an example:

from detectron2.data import MetadataCatalog
MetadataCatalog.get("my_dataset").thing_classes = ["person", "dog"]

Here is a list of metadata keys that are used by built-in features in detectron2. If you add your own dataset without these metadata keys, some features may be unavailable to you (a short sketch of setting a few of these keys follows the list):

  • thing_classes (list[str]): Used by all instance detection/segmentation tasks. A list of names for each instance/thing category. If you load a COCO format dataset, it will be automatically set by the function load_coco_json.
  • stuff_classes (list[str]): Used by semantic and panoptic segmentation tasks. A list of names for each stuff category.
  • stuff_colors (list[tuple(r, g, b)]): Pre-defined color (in [0, 255]) for each stuff category. Used for visualization. If not given, random colors are used.
  • keypoint_names (list[str]): Used by keypoint localization. A list of names for each keypoint.
  • keypoint_flip_map (list[tuple[str]]): Used by the keypoint localization task. A list of pairs of names, where each pair are the two keypoints that should be flipped if the image is flipped during augmentation.
  • keypoint_connection_rules: list[tuple(str, str, (r, g, b))]. Each tuple specifies a pair of keypoints that are connected and the color to use for the line between them when visualized.
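
As a sketch, a hypothetical two-keypoint dataset might attach its metadata like this (the dataset name and keypoint names are illustrative only):

from detectron2.data import MetadataCatalog

meta = MetadataCatalog.get("my_keypoint_dataset")  # hypothetical dataset name
meta.thing_classes = ["person"]
meta.keypoint_names = ["left_eye", "right_eye"]
meta.keypoint_flip_map = [("left_eye", "right_eye")]  # swapped when the image is flipped
meta.keypoint_connection_rules = [("left_eye", "right_eye", (0, 255, 0))]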

Some additional metadata is specific to the evaluation of certain datasets (e.g. COCO); a sketch of attaching it manually follows the list:

  • thing_dataset_id_to_contiguous_id (dict[int->int]): Used by all instance detection/segmentation tasks in the COCO format. A mapping from instance class ids in the dataset to contiguous ids in range [0, #class). Will be automatically set by the function load_coco_json.
  • stuff_dataset_id_to_contiguous_id (dict[int->int]): Used when generating prediction json files for semantic/panoptic segmentation. A mapping from semantic segmentation class ids in the dataset to contiguous ids in [0, num_categories). It is useful for evaluation only.
  • json_file: The COCO annotation json file. Used by COCO evaluation for COCO-format datasets.
  • panoptic_root, panoptic_json: Used by panoptic evaluation.
  • evaluator_type: Used by the builtin main training script to select evaluator. No need to use it if you write your own main script.
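
If a COCO-format dataset is registered manually (instead of via register_coco_instances), the evaluation-related metadata can be attached in the same attribute style; this is only a sketch and the values are placeholders:

from detectron2.data import MetadataCatalog

meta = MetadataCatalog.get("my_dataset")
meta.json_file = "json_annotation.json"  # path to the COCO-style annotation file
meta.evaluator_type = "coco"             # lets the builtin training script pick a COCO evaluator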

Note: "thing" and "stuff" are different concepts in the literature (see the relevant papers for background). In detectron2, "thing" is used for instance-level tasks, while "stuff" is used for semantic segmentation tasks; both are used in panoptic segmentation.
