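To reuse detectron2's data loaders with a custom dataset, you need to register the dataset and, optionally, its metadata.
The Colab Notebook has an example of registering a dataset and training on it in the standard format.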
from detectron2.data import DatasetCatalog

def get_dicts():
    dataset_dicts = []  # fill with one dict per image, in the format described below
    return dataset_dicts

DatasetCatalog.register("my_dataset", get_dicts)
This snippet associates the dataset name "my_dataset" with get_dicts, the function that returns the data.
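A get_dicts function might look like the following sketch. It builds the list from a hypothetical in-memory table (`RAW_ANNOTATIONS`, the file paths, and the sizes are all made up for illustration) and does not import detectron2, so it runs on its own. The constant 0 used for bbox_mode is the integer value of BoxMode.XYXY_ABS.

```python
# Hypothetical raw annotations, one entry per image (all values are made up).
RAW_ANNOTATIONS = [
    {
        "path": "images/0001.jpg",
        "size": (480, 640),  # (height, width)
        "objects": [{"box": [10.0, 20.0, 110.0, 220.0], "label": 0}],
    },
    {
        "path": "images/0002.jpg",
        "size": (300, 400),
        "objects": [],
    },
]

XYXY_ABS = 0  # integer value of detectron2.structures.BoxMode.XYXY_ABS


def get_dicts():
    """Return one dict per image, using the field names listed in this section."""
    dataset_dicts = []
    for idx, raw in enumerate(RAW_ANNOTATIONS):
        height, width = raw["size"]
        record = {
            "file_name": raw["path"],
            "image_id": str(idx),
            "height": height,
            "width": width,
            "annotations": [
                {
                    "bbox": obj["box"],
                    "bbox_mode": XYXY_ABS,
                    "category_id": obj["label"],
                    "iscrowd": 0,
                }
                for obj in raw["objects"]
            ],
        }
        dataset_dicts.append(record)
    return dataset_dicts
```

An image without objects simply gets an empty "annotations" list, as the second entry shows.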
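Each dict in the list describes one image and may contain the fields below. When a field is unavailable, the data loader may be able to derive it from another one, for example by loading "image" from "file_name" and "sem_seg" from "sem_seg_file_name".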
file_name: the full path to the image file.
sem_seg_file_name: the full path to the ground truth semantic segmentation file.
image: the image as a numpy array.
sem_seg: semantic segmentation ground truth in a 2D numpy array. Values in the array represent category labels.
height, width: integer. The shape of image.
image_id (str): a string to identify this image. Mainly used during evaluation to identify the image. Each dataset may use it for different purposes.
annotations (list[dict]): the per-instance annotations of every instance in this image. Each annotation dict may contain:
bbox (list[float]): list of 4 numbers representing the bounding box of the instance.
bbox_mode (int): the format of bbox. It must be a member of structures.BoxMode. Currently supports: BoxMode.XYXY_ABS, BoxMode.XYWH_ABS.
category_id (int): an integer in the range [0, num_categories) representing the category label. The value num_categories is reserved to represent the “background” category, if applicable.
segmentation (list[list[float]] or dict):
If list[list[float]], it represents a list of polygons, one for each connected component of the object. Each list[float] is one simple polygon in the format of [x1, y1, ..., xn, yn]. The Xs and Ys are either relative coordinates in [0, 1] or absolute coordinates, depending on whether “bbox_mode” is relative.
If dict, it represents the per-pixel segmentation mask in COCO’s RLE format.
keypoints (list[float]): in the format of [x1, y1, v1, ..., xn, yn, vn]. v[i] is the visibility of keypoint i. n must equal the number of keypoint categories. The Xs and Ys are either relative coordinates in [0, 1] or absolute coordinates, depending on whether “bbox_mode” is relative.
Note: keypoint coordinates in COCO annotations are integers in the range [0, H-1] or [0, W-1]. By default, detectron2 adds 0.5 to absolute keypoint coordinates to convert them from discrete pixel indices to floating-point coordinates.
iscrowd: 0 or 1. Whether this instance is labeled as COCO’s “crowd region”.
proposal_boxes (array): 2D numpy array with shape (K, 4) representing K precomputed proposal boxes for this image.
proposal_objectness_logits (array): numpy array with shape (K, ), which corresponds to the objectness logits of proposals in ‘proposal_boxes’.
proposal_bbox_mode (int): the format of the precomputed proposal bbox. It must be a member of structures.BoxMode. Default format is BoxMode.XYXY_ABS.
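Two of the conventions above, the difference between XYWH_ABS and XYXY_ABS boxes and the 0.5 shift applied to absolute keypoint coordinates, can be sketched in plain Python. The helpers `xywh_to_xyxy` and `shift_keypoints` are hypothetical stand-ins written for this illustration, not detectron2's own code, and the annotation values are made up.

```python
def xywh_to_xyxy(bbox):
    """Convert [x, y, width, height] to [x1, y1, x2, y2] in absolute pixels."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def shift_keypoints(keypoints, offset=0.5):
    """Add `offset` to the x and y of each [x, y, v] triple; leave v unchanged."""
    shifted = []
    for i in range(0, len(keypoints), 3):
        x, y, v = keypoints[i:i + 3]
        shifted.extend([x + offset, y + offset, v])
    return shifted


# A made-up annotation: one XYWH_ABS box and two keypoints at integer pixel indices.
annotation = {
    "bbox": [10.0, 20.0, 100.0, 200.0],
    "keypoints": [15, 30, 2, 60, 80, 1],
}

print(xywh_to_xyxy(annotation["bbox"]))          # [10.0, 20.0, 110.0, 220.0]
print(shift_keypoints(annotation["keypoints"]))  # [15.5, 30.5, 2, 60.5, 80.5, 1]
```

Only the x and y entries move; the visibility flags stay as they are.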
# If the dataset is already in COCO format, it can be registered directly:
from detectron2.data.datasets import register_coco_instances
register_coco_instances("my_dataset", {}, "json_annotation.json", "path/to/image/dir")
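For readers who have not seen a COCO annotation file, the sketch below writes a minimal one using only the standard library. The image name, sizes, box, and category ids are made up, and the temporary directory is just a convenient place to put the file. A real file for instance detection would normally carry many more images and annotations.

```python
import json
import os
import tempfile

# A minimal COCO-style annotation structure (all values are illustrative).
coco = {
    "images": [{"id": 1, "file_name": "0001.jpg", "height": 480, "width": 640}],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [10.0, 20.0, 100.0, 200.0],  # COCO boxes are [x, y, width, height]
            "area": 100.0 * 200.0,
            "iscrowd": 0,
        }
    ],
    "categories": [{"id": 1, "name": "person"}, {"id": 3, "name": "dog"}],
}

json_path = os.path.join(tempfile.mkdtemp(), "json_annotation.json")
with open(json_path, "w") as f:
    json.dump(coco, f)

# With detectron2 installed, json_path could be passed as the third argument
# of register_coco_instances, with the image directory as the fourth.
```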
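If you register a dataset with DatasetCatalog.register, it is a good idea to also add the corresponding metadata with MetadataCatalog.get(dataset_name).set(name, value), so that features that rely on it can use it later. For example, to set the thing_classes metadata: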
from detectron2.data import MetadataCatalog
MetadataCatalog.get("my_dataset").thing_classes = ["person", "dog"]
thing_classes (list[str]): Used by all instance detection/segmentation tasks. A list of names for each instance/thing category. If you load a COCO format dataset, it will be automatically set by the function load_coco_json.
stuff_classes (list[str]): Used by semantic and panoptic segmentation tasks. A list of names for each stuff category.
stuff_colors (list[tuple(r, g, b)]): Pre-defined color (in [0, 255]) for each stuff category. Used for visualization. If not given, random colors are used.
keypoint_names (list[str]): Used by keypoint localization. A list of names for each keypoint.
keypoint_flip_map (list[tuple[str]]): Used by the keypoint localization task. A list of pairs of names, where each pair gives the two keypoints that should be flipped if the image is flipped during augmentation.
keypoint_connection_rules: list[tuple(str, str, (r, g, b))]. Each tuple specifies a pair of keypoints that are connected and the color to use for the line between them when visualized.
thing_dataset_id_to_contiguous_id (dict[int->int]): Used by all instance detection/segmentation tasks in the COCO format. A mapping from instance class ids in the dataset to contiguous ids in range [0, #class). Will be automatically set by the function load_coco_json.
stuff_dataset_id_to_contiguous_id (dict[int->int]): Used when generating prediction json files for semantic/panoptic segmentation. A mapping from semantic segmentation class ids in the dataset to contiguous ids in [0, num_categories). It is useful for evaluation only.
json_file: The COCO annotation json file. Used by COCO evaluation for COCO-format datasets.
panoptic_root, panoptic_json: Used by panoptic evaluation.
evaluator_type: Used by the builtin main training script to select evaluator. No need to use it if you write your own main script.
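The idea behind thing_dataset_id_to_contiguous_id can be sketched as follows: dataset category ids, which may have gaps, are sorted and mapped to 0..N-1, and thing_classes lists the names in the same order. The helper `build_contiguous_id_map` and the category list are hypothetical and only illustrate the mapping; they are not taken from load_coco_json.

```python
def build_contiguous_id_map(categories):
    """Map sorted dataset category ids to 0..N-1 and list names in that order."""
    ordered = sorted(categories, key=lambda c: c["id"])
    id_map = {c["id"]: i for i, c in enumerate(ordered)}
    thing_classes = [c["name"] for c in ordered]
    return id_map, thing_classes


# Made-up categories with non-contiguous ids, given in arbitrary order.
categories = [
    {"id": 18, "name": "dog"},
    {"id": 1, "name": "person"},
    {"id": 3, "name": "car"},
]

id_map, thing_classes = build_contiguous_id_map(categories)
print(id_map)         # {1: 0, 3: 1, 18: 2}
print(thing_classes)  # ['person', 'car', 'dog']
```

With this mapping, a "category_id" of 18 in the dataset would become 2, which indexes "dog" in thing_classes.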