COCO数据集annotations解析以及可视化

针对数据集cocoval2017,解析其annotations文件夹下person_keypoints_val2017.json文件构成以及可视化。

目录

一、person_keypoints_val2017.json的结构分析

1. “info”

2. "licenses"

3. "categories"

4. "images": list of dictionary

5. ”annotations": list o dictionary

二、将整个json文件进行拆分,分出单个图像以及其对应的标注文件以进行后续的可视化

三、json文件可视化

1. 利用COCO API

1.1 加载对应img_id/cat_id的照片并可视化

1.2 加载对应照片的annotation并可视化

1.3 加载对应照片语义分析数据并打印

2. 用openCV可视化


一、person_keypoints_val2017.json的结构分析

根据coco官网给出的data format信息(https://cocodataset.org/#format-data),总体结构如下图所示:

COCO数据集annotations解析以及可视化_第1张图片

对于任务object detection:若检测到的是单个物体,iscrowd=0,segmentation mask为[polygon]的格式,即多边形顶点的坐标,但由于单个物体可能会被遮挡,有时候需要多个polygon来表示;若检测到是多个物体的集合体(例如一群人),iscrowd=1,则采用RLE编码的格式。categories部分存储了categories_id到categories的mapping。

COCO数据集annotations解析以及可视化_第2张图片对于任务keypoints detection,在object detection的基础上增加了keypoints和num_keypoints,每个关键点都有一个可见性标志 v,v=0:未标记(在这种情况下 x=y=0),v=1:标记但不可见,v=2:标记并且可见。skeleton定义了各个关键点之间的连接性。

COCO数据集annotations解析以及可视化_第3张图片

1. “info”

"info": {
        "description": "COCO 2017 Dataset",
        "url": "http://cocodataset.org",
        "version": "1.0",
        "year": 2017,
        "contributor": "COCO Consortium",
        "date_created": "2017/09/01"
    },

2. "licenses"

"licenses": [
        {
            "url": "http://creativecommons.org/licenses/by-nc-sa/2.0/",
            "id": 1,
            "name": "Attribution-NonCommercial-ShareAlike License"
        },
        {
            "url": "http://creativecommons.org/licenses/by-nc/2.0/",
            "id": 2,
            "name": "Attribution-NonCommercial License"
        },
    # 这里仅列举两个
    ]

3. "categories"

"categories": [
        {
            "supercategory": "person",
            "id": 1,
            "name": "person",
            "keypoints": [
                "nose",
                "left_eye",
                "right_eye",
                "left_ear",
                "right_ear",
                "left_shoulder",
                "right_shoulder",
                "left_elbow",
                "right_elbow",
                "left_wrist",
                "right_wrist",
                "left_hip",
                "right_hip",
                "left_knee",
                "right_knee",
                "left_ankle",
                "right_ankle"
            ],
            "skeleton": [[16,14],[14,12],[17,15],[15,13],[12,13],[6,12],[7,13],[6,7],[6,8],[7,9],[8,10],[9,11],[2,3],[1,2],[1,3],[2,4],[3,5],[4,6],[5,7]]

          }

]

上述1、2、3是全局通用的

4. "images": list of dictionary

"images": [
        {
            "license": 4,
            "file_name": "000000397133.jpg",
            "coco_url": "http://images.cocodataset.org/val2017/000000397133.jpg",
            "height": 427,
            "width": 640,
            "date_captured": "2013-11-14 17:02:52",
            "flickr_url": "http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg",
            "id": 397133
        }
    ]

5. ”annotations": list o dictionary

"annotations": [
        {
            "segmentation": [
                [
                    446.71,
                    70.66,
                    466.07,
                    72.89,
                    ......
                ]
            ],
            "num_keypoints": 13,
            "area": 17376.91885,
            "iscrowd": 0,
            "keypoints": [
                433,
                94,
                2,
                434,
                90,
                2,
                0,
                0,
                0,
                ...
            ],
            "image_id": 397133,
            "bbox": [
                388.66,
                69.92,
                109.41,
                277.62
            ],
            "category_id": 1,
            "id": 200887
        },

       #这里只列出一个object的ann信息

]

二、将整个json文件进行拆分,分出单个图像以及其对应的标注文件以进行后续的可视化

# -*- coding:utf-8 -*-

import json

json_file = r"path\to\person_keypoints_val2017.json"
data = json.load(open(json_file, 'r'))

data_2 = {
    'info': data['info'],
    'categories': data['categories'],
    'licenses': data['licenses'],
    'images': [data['images'][0]]
}

annotation = []
imgID = data_2['images'][0]['id']
for ann in data['annotations']:
    if ann['image_id'] == imgID:
        annotation.append(ann)
data_2['annotations'] = annotation

json.dump(data_2, open(r'path\to\single_person_kp.json', 'w'), indent=4)

三、json文件可视化

1. 利用COCO API

1.1 加载对应img_id/cat_id的照片并可视化

# -*- coding:utf-8 -*-
import os
from pycocotools.coco import COCO
import skimage.io as io
import matplotlib.pyplot as plt
import pylab
pylab.rcParams['figure.figsize'] = (8.0, 10.0)
annFile = r'path\to\COCO\COCO2017\annotations\person_keypoints_val2017.json'
coco = COCO(annFile)  # COCO的初始化,读入相应的ann文件

# display COCO categories and supercategories
# coco.getCatIds()返回的是ann文件中list of所有的["categories"]["id"]
# coco.loadCats()返回的是list of对应所有id的categories(即又回到了ann文件中的字典形式)
cats = coco.loadCats(coco.getCatIds())
nms = [cat['name'] for cat in cats] # 提取各个字典中的name
print('COCO categories: \n{}\n'.format(' '.join(nms)))
nms = set([cat['supercategory'] for cat in cats]) # 提取各个字典中的supercategory
print('COCO supercategories: \n{}'.format(' '.join(nms)))
# COCO categories:
# person
# COCO supercategories:
# person

# loadImgs()返回值是单元素列表,用[0]调用,img其元素为字典,key为"license", "file_name", "coco_url", "height", "width", "date_captured", "id"
# 加载指定id照片,两种途径:
# 1)本地图片,img["file_name"]
# 2)远程图片,img["coco_url"] --> I = io.imread(img["coco_url"])
imgIds = coco.getImgIds(imgIds=[397133])  # 给定参数img_id和cat_id,返回list of对应的所有img_id和包含了指定cat_id的所有img_id
img = coco.loadImgs(imgIds)[0] # loadImgs()返回list of给出id的image,为ann文件中的字典形式,这里用[0]调取列表中第一张img信息
img_path = os.path.join(r"path\to\COCO\COCO2017\val2017", img["file_name"])
I = io.imread(img_path)
# I = io.imread(img["coco_url"])
plt.axis('off');plt.imshow(I);plt.show()

COCO数据集annotations解析以及可视化_第4张图片

1.2 加载对应照片的annotation并可视化

# 加载指定index的照片和该照片内包含的所有类别的物体并且可视化标注
catIds = []
for ann in coco.dataset['annotations']:
    if ann['image_id'] == imgIds[0]:
        catIds.append(ann['category_id'])  # 把该照片包含的所有的物体的cat_id都提取出来
plt.imshow(I); plt.axis('off')
annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco.loadAnns(annIds)
coco.showAnns(anns)
# print(img['id']==imgIds[0]) --> true
plt.show()

COCO数据集annotations解析以及可视化_第5张图片

1.3 加载对应照片语义分析数据并打印

# 加载并打印该照片包含的语义分析数据
annFile = r"E:\Datasets\COCO\COCO2017\annotations\captions_val2017.json"
coco_caps = COCO(annFile)
annIds = coco_caps.getAnnIds(imgIds=img['id'])
anns = coco_caps.loadAnns(annIds)
coco_caps.showAnns(anns)
# plt.imshow(I)
# plt.axis('off'); plt.show()

 输出:

loading annotations into memory...
Done (t=0.10s)
creating index...
index created!
A man is in a kitchen making pizzas.
Man in apron standing on front of oven with pans and bakeware
A baker is working in the kitchen rolling dough.
A person standing by a stove in a kitchen.
A table with pies being made and a person standing near a wall with pots and pans hanging on the wall.

2. 用openCV可视化

# -*- coding:utf-8 -*-
import numpy as np
from pycocotools.coco import COCO
import cv2

annFile = r'E:\Datasets\COCO\COCO2017\annotations\person_keypoints_val2017.json'
img_prefix = r'E:\Datasets\COCO\COCO2017\val2017'
aColor = [(0, 255, 0, 0), (255, 0, 0, 0), (0, 0, 255, 0), (0, 255, 255, 0)]
coco = COCO(annFile)

# getCatIds(catNms=[], supNms=[], catIds=[])
# 通过输入类别的名字、大类的名字或是类别的id,来筛选得到图片所属类别的id
catIds = coco.getCatIds(catNms=['person']) # 1
# getImgIds(imgIds=[], catIds=catIds)
# 通过图片的id或是所属种类的id得到图片的id
# 得到图片的id信息后,就可以用loadImgs得到图片的信息了
# 这里选取imgIds = 397133这张照片进行可视化
imgIds = 397133
img = coco.loadImgs(imgIds)[0]
matImg = cv2.imread('%s/%s' % (img_prefix, img['file_name']))
annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)

# 通过注释的id,得到注释的信息
anns = coco.loadAnns(annIds)
print(ann['id'] for ann in anns)  # 核实一下是不是200887 1218137
for ann in anns:
    sks = np.array(coco.loadCats(ann['category_id'])[0]['skeleton']) - 1
    # 把skeleton里的坐标值转换成0开始的index
    kp = np.array(ann['keypoints'])

    x = kp[0::3]
    y = kp[1::3]
    v = kp[2::3]

    for sk in sks:
        # c = (np.random.random((1, 3)) * 0.6 + 0.4).tolist()[0]
        c = aColor[np.random.randint(0, 4)]
        if np.all(v[sk] > 0):
            # 画点之间的连接线
            cv2.line(matImg, (x[sk][0], y[sk][0]), (x[sk][1], y[sk][1]), c, 1)
    for i in range(x.shape[0]):
        c = aColor[np.random.randint(0, 4)]
        cv2.circle(matImg, (x[i], y[i]), 2, c, lineType=1)

cv2.imshow("show", matImg)
cv2.waitKey(0)

COCO数据集annotations解析以及可视化_第6张图片

你可能感兴趣的:(COCO)