GRDDC2020数据集下载及介绍

1. 基本介绍

GRDDC是全球道路损伤检测挑战赛提供的数据集,从2018~2020每年举行一次。

2020年的数据集共21041个样本,包含三个国家:"Czech", "India", "Japan";样本量分别为:2829、7706、10506。

数据集中一共10类: ['d00', 'd01', 'd0w0', 'd10', 'd11', 'd20', 'd40', 'd43', 'd44', 'd50'],各类别含义分别为:

  • d00:纵向裂缝;
  • d01(d0w0为错误标签): 纵向拼接缝;
  • d10: 横向裂缝;
  • d11: 横向拼接缝;
  • d20:龟裂;
  • d40: 坑洞;
  • d43: 十字路口模糊;
  • d44: 白线模糊;
  • d50: 井盖;

2. 类别分布

各个类别的分布如下:

GRDDC2020数据集下载及介绍_第1张图片

统计代码如下:

import os
import xml.etree.ElementTree as ET
import tqdm


def xml_parse(target):
    """
    Arguments:
        target (annotation) : the target annotation to be made usable
            will be an ET.Element
    Returns:
        a list containing lists of bounding boxes  [bbox coords, class name]
    """
    res = []
    for obj in target.iter("object"):
        name = obj.find("name").text.strip()
        bbox = obj.find("bndbox")

        pts = ["xmin", "ymin", "xmax", "ymax"]
        bndbox = [name]
        for i, pt in enumerate(pts):
            cur_pt = int(float(bbox.find(pt).text))
            # scale height or width
            # cur_pt = cur_pt / width if i % 2 == 0 else cur_pt / height
            bndbox.append(cur_pt)

        res.append(bndbox)  # [xmin, ymin, xmax, ymax, label_name]
        # img_id = target.find('filename').text[:-4]

    return res  # [[xmin, ymin, xmax, ymax, label_ind], ... ]


if __name__ == '__main__':
    raw_path = "path_to_grddc2020/train"
    sub_dirs = ["Czech", "India", "Japan"]

    statistic = {}
    for sub_dir in sub_dirs:
        img_path_root = os.path.join(raw_path, sub_dir, 'images')
        xml_path_root = os.path.join(raw_path, sub_dir, 'annotations', 'xmls')
        for f in tqdm.tqdm(os.listdir(img_path_root), ncols=100):

            # parse xml
            xml_path = os.path.join(xml_path_root, os.path.splitext(f)[0]+'.xml')
            target = ET.parse(xml_path)
            res = xml_parse(target)


            for r in res:
                cls = r[0]
                # statistic
                if cls not in statistic.keys():
                    statistic[cls] = 1
                else:
                    statistic[cls] += 1

    print(statistic)
    for key in sorted(statistic):
        print("{}: {}".format(key, statistic[key]))

3. 百度云下载

链接: https://pan.baidu.com/s/1rwiflV7PVn_EZNyL7MQigg 提取码: ij09 
 

你可能感兴趣的:(CV数据处理,GRDDC,道路损伤数据集)