mmdet 3.0 series: the BaseDataset class

BaseDataset

    • Introduction
    • `__init__`
    • `self.full_init`
    • `self.load_data_list`
    • `self.parse_data_info`
    • `self.filter_data`

Introduction

In mmdet 3.0, the former CustomDataset has been replaced by BaseDataset and by BaseDetDataset, which is used for detection. BaseDataset lives in mmengine.

__init__

        # Full initialize the dataset.
        if not lazy_init:
            self.full_init()

Unless lazy_init is True, __init__ calls self.full_init(), which is defined in BaseDataset, to initialize the object.
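The eager versus lazy behavior can be illustrated with a toy class. This is a minimal sketch, not mmengine code: ToyDataset and its records argument are hypothetical, and __len__ triggers initialization by hand to imitate what the docstring of full_init attributes to the force_full_init decorator.

```python
class ToyDataset:
    """Hypothetical stand-in for BaseDataset (not the mmengine class)."""

    def __init__(self, records, lazy_init=False):
        self._records = records
        self._fully_initialized = False
        self.data_list = []
        # Same pattern as the excerpt above: initialize now unless lazy.
        if not lazy_init:
            self.full_init()

    def full_init(self):
        # Calling full_init twice is harmless: the second call returns early.
        if self._fully_initialized:
            return
        self.data_list = list(self._records)
        self._fully_initialized = True

    def __len__(self):
        # Imitates force_full_init: initialize on first real use.
        self.full_init()
        return len(self.data_list)


eager = ToyDataset([{"img_path": "a.jpg"}])
lazy = ToyDataset([{"img_path": "a.jpg"}], lazy_init=True)
```

Here eager is ready as soon as it is constructed, while lazy only loads its data the first time len(lazy) is called.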

self.full_init

In self.full_init, initialization proceeds in the following steps:

  • Call self.load_data_list to load the annotations from the annotation file
  • Call self.filter_data to filter the annotations according to filter_cfg
  • (Optional) If self._indices is not None, call self._get_unserialized_subset to slice the dataset
  • (Optional) If self.serialize_data is True, call self._serialize_data() to serialize self.data_list

    def full_init(self):
        """Load annotation file and set ``BaseDataset._fully_initialized`` to
        True.

        If ``lazy_init=False``, ``full_init`` will be called during the
        instantiation and ``self._fully_initialized`` will be set to True. If
        ``obj._fully_initialized=False``, the class method decorated by
        ``force_full_init`` will call ``full_init`` automatically.

        Several steps to initialize annotation:

            - load_data_list: Load annotations from annotation file.
            - filter data information: Filter annotations according to
              filter_cfg.
            - slice_data: Slice dataset according to ``self._indices``
            - serialize_data: Serialize ``self.data_list`` if
            ``self.serialize_data`` is True.
        """
        if self._fully_initialized:
            return
        # load data information
        self.data_list = self.load_data_list()
        # filter illegal data, such as data that has no annotations.
        self.data_list = self.filter_data()
        # Get subset data according to indices.
        if self._indices is not None:
            self.data_list = self._get_unserialized_subset(self._indices)

        # serialize data_list
        if self.serialize_data:
            self.data_bytes, self.data_address = self._serialize_data()

        self._fully_initialized = True
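The last step returns two values, self.data_bytes and self.data_address, but the excerpt does not show how they are built. The sketch below is one way such a pair could work, assuming a pickle-based scheme in which every item is pickled, the bytes are concatenated, and the end offset of each item is recorded. The helpers serialize and get_item are hypothetical names, and mmengine's actual implementation may differ.

```python
import pickle
from itertools import accumulate


def serialize(data_list):
    """Pack a list of dicts into one bytes object plus end offsets."""
    chunks = [pickle.dumps(item, protocol=4) for item in data_list]
    data_bytes = b"".join(chunks)
    # data_address[i] is the offset one past the end of item i.
    data_address = list(accumulate(len(chunk) for chunk in chunks))
    return data_bytes, data_address


def get_item(data_bytes, data_address, idx):
    """Recover item idx without unpickling any other item."""
    start = 0 if idx == 0 else data_address[idx - 1]
    end = data_address[idx]
    return pickle.loads(data_bytes[start:end])


data_list = [
    {"img_path": "xxx/xxx_1.jpg", "height": 604, "width": 640},
    {"img_path": "xxx/xxx_2.jpg", "height": 320, "width": 460},
]
data_bytes, data_address = serialize(data_list)
second = get_item(data_bytes, data_address, 1)
```

Storing one flat buffer instead of many small Python objects is a common way to keep memory use down when a dataset is shared across dataloader worker processes.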

self.load_data_list

self.load_data_list calls mmengine's load function to read a YAML, JSON, or pickle file. Reading the file yields a dict, which is the annotation format that BaseDataset defines in mmdet 3.0:

{
    'metainfo':
        {
            'classes': ('person', 'bicycle', 'car', 'motorcycle'),
            ...
        },
    'data_list':
        [
            {
                "img_path": "xxx/xxx_1.jpg",
                "height": 604,
                "width": 640,
                "instances":
                [
                  {
                    "bbox": [0, 0, 10, 20],
                    "bbox_label": 1,
                    "ignore_flag": 0
                  },
                  {
                    "bbox": [10, 10, 110, 120],
                    "bbox_label": 2,
                    "ignore_flag": 0
                  }
                ]
              },
            {
                "img_path": "xxx/xxx_2.jpg",
                "height": 320,
                "width": 460,
                "instances":
                [
                  {
                    "bbox": [10, 0, 20, 20],
                    "bbox_label": 3,
                    "ignore_flag": 1,
                  }
                ]
              },
            ...
        ]
}

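It then calls self.parse_data_info on each dict in data_list. In practice this joins the img_path field with the path prefix. The processed dicts are collected into a list, which is returned and assigned to self.data_list by full_init.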
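The loading flow can be sketched without mmengine. This is a simplified illustration that only handles JSON. The function name and its parse_data_info parameter mirror the method names above, but the signature is an assumption made for this example, not mmengine's.

```python
import json
import os
import tempfile


def load_data_list(ann_file, parse_data_info=lambda info: info):
    """Read an annotation file and parse every entry of its data_list."""
    with open(ann_file, encoding="utf-8") as f:
        annotations = json.load(f)
    if not isinstance(annotations, dict):
        raise TypeError("the annotation file must contain a dict")
    if "metainfo" not in annotations or "data_list" not in annotations:
        raise ValueError("annotation must have 'metainfo' and 'data_list' keys")

    data_list = []
    for raw_data_info in annotations["data_list"]:
        parsed = parse_data_info(raw_data_info)
        # A parser may return one dict or a list of dicts.
        if isinstance(parsed, dict):
            data_list.append(parsed)
        else:
            data_list.extend(parsed)
    return annotations["metainfo"], data_list


demo = {
    "metainfo": {"classes": ["person", "bicycle"]},
    "data_list": [
        {
            "img_path": "xxx/xxx_1.jpg",
            "height": 604,
            "width": 640,
            "instances": [
                {"bbox": [0, 0, 10, 20], "bbox_label": 1, "ignore_flag": 0}
            ],
        }
    ],
}

# Write the demo annotation to a temporary file, then load it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    ann_file = os.path.join(tmp_dir, "ann.json")
    with open(ann_file, "w", encoding="utf-8") as f:
        json.dump(demo, f)
    metainfo, data_list = load_data_list(ann_file)
```

Note that JSON has no tuple type, so classes comes back as a list even if it was written from a tuple.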

self.parse_data_info

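This function parses the annotation shown above into the target format. Its argument is one element of data_list, that is, a dict holding the annotation of a single image. If the annotation format changes, override this function to handle the new format.

The default is data_prefix: dict = dict(img_path=''). For every key in self.data_prefix, the function joins the prefix with the matching field of the dict (img_path by default) and returns the dict.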

    def parse_data_info(self, raw_data_info: dict) -> Union[dict, List[dict]]:
        """Parse raw annotation to target format.

        Args:
            raw_data_info (dict): Raw data information load from ``ann_file``

        Returns:
            list or list[dict]: Parsed annotation.
        """
        for prefix_key, prefix in self.data_prefix.items():
            assert prefix_key in raw_data_info, (
                f'raw_data_info: {raw_data_info} dose not contain prefix key'
                f'{prefix_key}, please check your data_prefix.')
            raw_data_info[prefix_key] = osp.join(prefix,
                                                 raw_data_info[prefix_key])
        return raw_data_info
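The effect of the prefix join is easy to check in isolation. The sketch below is a standalone function rather than a method, and unlike the code above it works on a shallow copy instead of modifying raw_data_info in place. Both choices are made for this example only.

```python
import os.path as osp


def parse_data_info(raw_data_info, data_prefix):
    """Join each prefix in data_prefix onto the matching field."""
    parsed = dict(raw_data_info)  # shallow copy; the input stays untouched
    for prefix_key, prefix in data_prefix.items():
        if prefix_key not in parsed:
            raise KeyError(f"{prefix_key!r} is missing from the data info")
        parsed[prefix_key] = osp.join(prefix, parsed[prefix_key])
    return parsed


raw = {"img_path": "xxx_1.jpg", "height": 604, "width": 640}

# The default empty prefix leaves the path as it is.
unchanged = parse_data_info(raw, {"img_path": ""})

# A non-empty prefix is prepended with the platform's path separator.
prefixed = parse_data_info(raw, {"img_path": "data/coco"})
```

One property of os.path.join worth keeping in mind: if img_path in the annotation file is already an absolute path, the prefix is discarded and the absolute path is returned as is.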

self.filter_data

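self.filter_data is meant to filter the annotations according to filter_cfg. The base class does not implement any filtering and simply returns self.data_list unchanged. If self.data_list needs to be filtered in a particular way, override this method in a subclass.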

    def filter_data(self) -> List[dict]:
        """Filter annotations according to filter_cfg. Defaults return all
        ``data_list``.

        If some ``data_list`` could be filtered according to specific logic,
        the subclass should override this method.

        Returns:
            list[int]: Filtered results.
        """
        return self.data_list
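A subclass override might look like the sketch below. ToyBase and ToyDetDataset are hypothetical stand-ins that do not inherit from mmengine, and the two filter_cfg keys used here (filter_empty_gt and min_size) are chosen for illustration only. Check the dataset class you actually use for the keys it supports.

```python
class ToyBase:
    """Hypothetical base class with the same no-op filter as above."""

    def __init__(self, data_list, filter_cfg=None):
        self.data_list = data_list
        self.filter_cfg = filter_cfg

    def filter_data(self):
        return self.data_list


class ToyDetDataset(ToyBase):
    """Drops images that have no instances or that are too small."""

    def filter_data(self):
        if self.filter_cfg is None:
            return self.data_list
        filter_empty_gt = self.filter_cfg.get("filter_empty_gt", False)
        min_size = self.filter_cfg.get("min_size", 0)

        kept = []
        for info in self.data_list:
            if filter_empty_gt and not info.get("instances"):
                continue  # image has no annotated instances
            if min(info["width"], info["height"]) < min_size:
                continue  # image is smaller than the configured minimum
            kept.append(info)
        return kept


box = {"bbox": [0, 0, 10, 20], "bbox_label": 1, "ignore_flag": 0}
data_list = [
    {"img_path": "a.jpg", "width": 640, "height": 604, "instances": [box]},
    {"img_path": "b.jpg", "width": 460, "height": 320, "instances": []},
    {"img_path": "c.jpg", "width": 20, "height": 16, "instances": [box]},
]
dataset = ToyDetDataset(
    data_list, filter_cfg=dict(filter_empty_gt=True, min_size=32))
filtered = dataset.filter_data()
```

With this configuration only a.jpg survives: b.jpg has no instances and c.jpg is below the minimum size.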
