Dataset - Visual Genome Dataset Format

Visual Genome Readme

1. Images

  • File image part1, image part2

    All images, in JPEG format.

    Each file is named IMAGE_ID.jpg.

2. Image meta data

  • File image_data.json.zip

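    Metadata for all images, in the following format:

| Name | Type | Description |
| --- | --- | --- |
| image_id | int | Image ID |
| url | hyperlink string | Image URL |
| width | int | Image width in pixels |
| height | int | Image height in pixels |
| coco_id | int | ID of the image in the COCO dataset |
| flickr_id | int | ID of the image in Flickr |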

Example:


[...
{
"image_id": 2412112,
"url": "https://cs.stanford.edu/people/rak248/VG_100K/2370463.jpg",
"width": 500,
"height": 281,
"coco_id": 547168,
"flickr_id": 8505158818
}
...]
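
A record like the one above can be indexed by `image_id` for quick lookup. The following is a minimal sketch, not part of the official readme: the inline `SAMPLE` string stands in for the contents of the unzipped `image_data.json`, and the helper name `index_by_image_id` is an assumption made for illustration.

```python
import json

# Stand-in for the contents of image_data.json (record copied from the example above).
SAMPLE = """[
  {"image_id": 2412112,
   "url": "https://cs.stanford.edu/people/rak248/VG_100K/2370463.jpg",
   "width": 500, "height": 281,
   "coco_id": 547168, "flickr_id": 8505158818}
]"""


def index_by_image_id(records):
    """Hypothetical helper: map image_id -> metadata record."""
    return {rec["image_id"]: rec for rec in records}


records = json.loads(SAMPLE)
meta = index_by_image_id(records)

# Width and height are in pixels, so their ratio is the aspect ratio.
aspect = meta[2412112]["width"] / meta[2412112]["height"]
print(meta[2412112]["url"], aspect)
```

With a real file, `json.loads(SAMPLE)` would be replaced by `json.load` on the opened file.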

3. Region Descriptions

  • File region_descriptions.json.zip

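    All region descriptions.

| Name | Type | Description |
| --- | --- | --- |
| image_id | int | ID of the image containing the regions |
| regions | object array | Region descriptions for this image |
| ----.region_id | int | Region description ID |
| ----.x | int | x-coordinate of the region bounding box |
| ----.y | int | y-coordinate of the region bounding box |
| ----.width | int | Width of the region bounding box |
| ----.height | int | Height of the region bounding box |
| ----.phrase | str | Region description phrase |
| ----.synsets | object array | Synsets in the description |
| --------.synset_name | str | Synset name |
| --------.entity_name | str | String from the phrase |
| --------.entity_idx_start | int | Index where the synset starts in the phrase |
| --------.entity_idx_end | int | Index where the synset ends in the phrase |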

Example:

```

[...
{
"image_id": 2407890,
"regions": [...
{
"region_id": 1353,
"x": 117,
"y": 79,
"width": 249,
"height": 107,
"phrase": "a cat sitting on a table.",
"synsets": [...
{
"synset_name": "cat.n.01",
"entity_name": "cat",
"entity_idx_start": 2,
"entity_idx_end": 5
},
...]
},
{
"region_id": 1354,
"x": 116,
"y": 29,
"width": 239,
"height": 135,
"phrase": "a white cat with a tan tail and face markings",
"synsets": [...
...]
},
...]
},
{
"image_id": 2407890,
"regions": [...
...]
},
...]

```
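
The region fields above can be turned into corner coordinates, and the synset indices can be used to slice the phrase. This is a rough sketch under two assumptions that the readme does not state explicitly: `(x, y)` is taken to be the top-left corner of the box, and `entity_idx_end` is treated as exclusive (which matches the sample, where indices 2 to 5 select "cat"). The helper names `to_corners` and `synset_spans` are made up for illustration.

```python
# Region copied from the example above.
region = {
    "region_id": 1353,
    "x": 117, "y": 79, "width": 249, "height": 107,
    "phrase": "a cat sitting on a table.",
    "synsets": [
        {"synset_name": "cat.n.01", "entity_name": "cat",
         "entity_idx_start": 2, "entity_idx_end": 5},
    ],
}


def to_corners(region):
    """Convert (x, y, width, height) to (x1, y1, x2, y2).

    Assumption: (x, y) is the top-left corner of the bounding box.
    """
    x1, y1 = region["x"], region["y"]
    return x1, y1, x1 + region["width"], y1 + region["height"]


def synset_spans(region):
    """Return (synset_name, substring of the phrase) for each synset."""
    phrase = region["phrase"]
    return [
        (s["synset_name"], phrase[s["entity_idx_start"]:s["entity_idx_end"]])
        for s in region["synsets"]
    ]


corners = to_corners(region)
spans = synset_spans(region)
print(corners, spans)
```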

4. Question Answers (QAs)

  • File question_answers.json.zip

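    All questions and answers (QAs).

| Name | Type | Description |
| --- | --- | --- |
| image_id | int | Image ID |
| qas | object array | QAs for this image |
| ----.qa_id | str | QA ID |
| ----.question | str | Question |
| ----.answer | str | Answer |
| ----.question_synsets | object array | Synsets in the question |
| --------.synset_name | str | Synset name |
| --------.entity_name | str | String from the question |
| --------.entity_idx_start | int | Index where the synset starts in the question |
| --------.entity_idx_end | int | Index where the synset ends in the question |
| ----.answer_synsets | object array | Synsets in the answer |
| --------.synset_name | str | Synset name |
| --------.entity_name | str | String from the answer |
| --------.entity_idx_start | int | Index where the synset starts in the answer |
| --------.entity_idx_end | int | Index where the synset ends in the answer |

Example: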


[...
{
"image_id": 2317993,
"qas": [...
{
"qa_id": 912402,
"question": "Where are the clouds?",
"answer": "sky",
"question_synsets": [...
{
"synset_name": "cloud.n.01",
"entity_name": "cloud",
"entity_idx_start": 14,
"entity_idx_end": 20
},
...],
"answer_synsets": [...
{
"synset_name": "sky.n.01",
"entity_name": "sky",
"entity_idx_start": 0,
"entity_idx_end": 3
},
...]
},
...]
},
...]
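
For training or evaluation it is often convenient to flatten the nested per-image structure into one row per question. The sketch below does that for the sample entry; the generator name `iter_qa_pairs` is an assumption, and the inline `entries` list stands in for the parsed `question_answers.json`.

```python
# Entry copied from the example above (synset arrays shortened to one item each).
entries = [
    {
        "image_id": 2317993,
        "qas": [
            {
                "qa_id": 912402,
                "question": "Where are the clouds?",
                "answer": "sky",
                "question_synsets": [
                    {"synset_name": "cloud.n.01", "entity_name": "cloud",
                     "entity_idx_start": 14, "entity_idx_end": 20},
                ],
                "answer_synsets": [
                    {"synset_name": "sky.n.01", "entity_name": "sky",
                     "entity_idx_start": 0, "entity_idx_end": 3},
                ],
            },
        ],
    },
]


def iter_qa_pairs(entries):
    """Yield one (image_id, qa_id, question, answer) tuple per QA."""
    for entry in entries:
        for qa in entry["qas"]:
            yield entry["image_id"], qa["qa_id"], qa["question"], qa["answer"]


rows = list(iter_qa_pairs(entries))
print(rows)
```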

5. Objects

  • File objects.json.zip

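    All object instances.

| Name | Type | Description |
| --- | --- | --- |
| image_id | int | Image ID |
| objects | object array | Object instances for this image |
| ----.object_id | int | Object ID |
| ----.x | int | x-coordinate of the object bounding box |
| ----.y | int | y-coordinate of the object bounding box |
| ----.w | int | Width of the object bounding box |
| ----.h | int | Height of the object bounding box |
| ----.name | str | Object name |
| ----.synsets | str array | Synset names associated with this object |

Example: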


[...
{
"image_id": 2,
"objects": [...
{
"object_id": 1023847,
"x": 405,
"y": 34,
"w": 78,
"h": 438,
"name": "pole",
"synsets": ["pole.n.01"]
},
{
"object_id": 1023836,
"x": 239,
"y": 347,
"w": 136,
"h": 126,
"name": "car",
"synsets": ["car.n.01"]
},
...]
},
...]
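
Because each object carries a list of synset names, objects can be grouped by synset across images. The following is an illustrative sketch only: `objects_by_synset` is a hypothetical helper, and the inline `entries` list stands in for the parsed `objects.json`.

```python
from collections import defaultdict

# Entry copied from the example above.
entries = [
    {
        "image_id": 2,
        "objects": [
            {"object_id": 1023847, "x": 405, "y": 34, "w": 78, "h": 438,
             "name": "pole", "synsets": ["pole.n.01"]},
            {"object_id": 1023836, "x": 239, "y": 347, "w": 136, "h": 126,
             "name": "car", "synsets": ["car.n.01"]},
        ],
    },
]


def objects_by_synset(entries):
    """Map synset name -> list of (image_id, object_id) pairs."""
    index = defaultdict(list)
    for entry in entries:
        for obj in entry["objects"]:
            for synset in obj["synsets"]:
                index[synset].append((entry["image_id"], obj["object_id"]))
    return dict(index)


index = objects_by_synset(entries)
print(index)
```

Note that object boxes use the keys `w` and `h`, whereas region boxes use `width` and `height`.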

6. Attributes

  • File attributes.json.zip

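    All attributes in the dataset.

| Name | Type | Description |
| --- | --- | --- |
| image_id | int | Image ID |
| attributes | object array | Object instances for this image, with their attributes |
| ----.object_id | int | Object ID |
| ----.x | int | x-coordinate of the object bounding box |
| ----.y | int | y-coordinate of the object bounding box |
| ----.w | int | Width of the object bounding box |
| ----.h | int | Height of the object bounding box |
| ----.name | str | Object name |
| ----.synsets | str array | Synset names associated with this object |
| ----.attributes | str array | Attributes of this object |

Example: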


[...
{
"image_id": 2,
"attributes": [...
{
"object_id": 1023847,
"x": 405,
"y": 34,
"w": 78,
"h": 438,
"name": "pole",
"synsets": ["pole.n.01"],
"attributes": ["brown"]
},
{
"object_id": 1023836,
"x": 239,
"y": 347,
"w": 136,
"h": 126,
"name": "car",
"synsets": ["car.n.01"],
"attributes": ["red", "broken"]
},
...]
},
...]
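
One simple use of this file is counting how often each attribute occurs with each object name. The sketch below is a hedged illustration: `attribute_counts` is a hypothetical helper, and the use of `.get("attributes", [])` reflects an assumption that some objects in the real file may lack the inner `attributes` key.

```python
from collections import Counter

# Entry copied from the example above.
entries = [
    {
        "image_id": 2,
        "attributes": [
            {"object_id": 1023847, "x": 405, "y": 34, "w": 78, "h": 438,
             "name": "pole", "synsets": ["pole.n.01"],
             "attributes": ["brown"]},
            {"object_id": 1023836, "x": 239, "y": 347, "w": 136, "h": 126,
             "name": "car", "synsets": ["car.n.01"],
             "attributes": ["red", "broken"]},
        ],
    },
]


def attribute_counts(entries):
    """Count (object name, attribute) pairs over all images."""
    counts = Counter()
    for entry in entries:
        # The outer "attributes" key holds the objects of the image.
        for obj in entry["attributes"]:
            # Assumption: objects without annotations may omit the inner key.
            for attr in obj.get("attributes", []):
                counts[(obj["name"], attr)] += 1
    return counts


counts = attribute_counts(entries)
print(counts)
```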

7. Relationships

  • File relationships.json.zip

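    All relationships.

| Name | Type | Description |
| --- | --- | --- |
| image_id | int | Image ID |
| relationships | object array | Relationships in the image |
| ----.relationship_id | int | Relationship ID |
| ----.predicate | str | Relationship predicate (e.g. "wears") |
| ----.synsets | str array | Synset names associated with the predicate |
| ----.subject | object | Subject of the relationship |
| --------.object_id | int | Object ID |
| --------.x | int | x-coordinate of the object bounding box |
| --------.y | int | y-coordinate of the object bounding box |
| --------.w | int | Width of the object bounding box |
| --------.h | int | Height of the object bounding box |
| --------.name | str | Object name |
| --------.synsets | str array | Synset names associated with this object |
| ----.object | object | Object of the relationship |
| --------.object_id | int | Object ID |
| --------.x | int | x-coordinate of the object bounding box |
| --------.y | int | y-coordinate of the object bounding box |
| --------.w | int | Width of the object bounding box |
| --------.h | int | Height of the object bounding box |
| --------.name | str | Object name |
| --------.synsets | str array | Synset names associated with this object |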

Example:


[...
{
"image_id": 2,
"relationships": [...
{
"relationship_id": 15947,
"predicate": "wears",
"synsets": ["wear.v.01"],
"subject": {
"object_id": 1023838,
"x": 324,
"y": 320,
"w": 142,
"h": 255,
"name": "man",
"synsets": ["man.n.01"]
},
"object": {
"object_id": 5071,
"x": 359,
"y": 362,
"w": 72,
"h": 81,
"name": "backpack",
"synsets": ["backpack.n.01"]
}
},
...]
},
...]
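
Each relationship embeds full subject and object records, so it reduces naturally to a (subject, predicate, object) triple. A minimal sketch, assuming the structure shown in the example; `to_triples` is a hypothetical helper and `entries` stands in for the parsed `relationships.json`.

```python
# Entry copied from the example above.
entries = [
    {
        "image_id": 2,
        "relationships": [
            {
                "relationship_id": 15947,
                "predicate": "wears",
                "synsets": ["wear.v.01"],
                "subject": {"object_id": 1023838, "x": 324, "y": 320,
                            "w": 142, "h": 255, "name": "man",
                            "synsets": ["man.n.01"]},
                "object": {"object_id": 5071, "x": 359, "y": 362,
                           "w": 72, "h": 81, "name": "backpack",
                           "synsets": ["backpack.n.01"]},
            },
        ],
    },
]


def to_triples(entries):
    """Return (subject name, predicate, object name) for every relationship."""
    return [
        (rel["subject"]["name"], rel["predicate"], rel["object"]["name"])
        for entry in entries
        for rel in entry["relationships"]
    ]


triples = to_triples(entries)
print(triples)
```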

8. Synset Name & Descriptions

  • File synsets.json.zip

    All synsets and their definitions.

| Name | Type | Description |
| --- | --- | --- |
| synset_name | str | Unique synset name |
| synset_definition | str | Definition of the synset according to WordNet |

Example:


[...
{
"synset_name": "phonograph_record.n.01",
"synset_definition": "sound recording consisting of a disk with a continuous groove; used to reproduce music by rotating while a phonograph needle tracks in the groove"
},
{
"synset_name": "truck.n.01",
"synset_definition": "an automotive vehicle suitable for hauling"
},
...]
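
Synset names follow the WordNet `lemma.pos.sense` pattern (for example `truck.n.01`), so they can be split into parts and used as lookup keys. The sketch below is illustrative: `parse_synset_name` and `definitions` are names chosen here, and the inline list stands in for the parsed `synsets.json`.

```python
# Records copied from the example above.
synsets = [
    {"synset_name": "phonograph_record.n.01",
     "synset_definition": "sound recording consisting of a disk with a "
                          "continuous groove; used to reproduce music by "
                          "rotating while a phonograph needle tracks in "
                          "the groove"},
    {"synset_name": "truck.n.01",
     "synset_definition": "an automotive vehicle suitable for hauling"},
]


def parse_synset_name(synset_name):
    """Split 'lemma.pos.NN' into (lemma, pos, sense number).

    rsplit is used so that only the last two dots act as separators.
    """
    lemma, pos, sense = synset_name.rsplit(".", 2)
    return lemma, pos, int(sense)


# Lookup table: synset name -> WordNet definition.
definitions = {s["synset_name"]: s["synset_definition"] for s in synsets}

parsed = parse_synset_name("phonograph_record.n.01")
print(parsed, definitions["truck.n.01"])
```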

9. Region Graphs

  • File region_graphs.json.zip

    All region graphs.

| Name | Type | Description |
| --- | --- | --- |
| image_id | int | ID of the image containing the regions |
| regions | object array | Region descriptions for this image |
| ----.region_id | int | Region description ID |
| ----.x | int | x-coordinate of the region bounding box |
| ----.y | int | y-coordinate of the region bounding box |
| ----.width | int | Width of the region bounding box |
| ----.height | int | Height of the region bounding box |
| ----.phrase | str | Region description phrase |
| ----.synsets | object array | Synsets in the description |
| --------.synset_name | str | Synset name |
| --------.entity_name | str | String from the phrase |
| --------.entity_idx_start | int | Index where the synset starts in the phrase |
| --------.entity_idx_end | int | Index where the synset ends in the phrase |
| ----.objects | object array | Object instances in this region |
| --------.object_id | int | Object ID |
| --------.x | int | x-coordinate of the object bounding box |
| --------.y | int | y-coordinate of the object bounding box |
| --------.w | int | Width of the object bounding box |
| --------.h | int | Height of the object bounding box |
| --------.name | str | Object name |
| --------.synsets | str array | Synset names associated with this object |
| ----.relationships | object array | Relationships in this region |
| --------.relationship_id | int | Relationship ID |
| --------.predicate | str | Relationship predicate (e.g. "wears") |
| --------.synsets | str array | Synset names associated with the predicate |
| --------.subject_id | int | ID of the subject (found in the objects list) |
| --------.object_id | int | ID of the object (found in the objects list) |

Example:


[...
{
"image_id": 2407890,
"regions": [...
{
"region_id": 1353,
"x": 117,
"y": 79,
"width": 249,
"height": 107,
"phrase": "a cat sitting on a table.",
"synsets": [...
{
"synset_name": "cat.n.01",
"entity_name": "cat",
"entity_idx_start": 2,
"entity_idx_end": 5
},
...],
"objects": [...
{
"object_id": 1023838,
"x": 324,
"y": 320,
"w": 142,
"h": 255,
"name": "cat",
"synsets": ["cat.n.01"]
},
{
"object_id": 5071,
"x": 359,
"y": 362,
"w": 72,
"h": 81,
"name": "table",
"synsets": ["table.n.01"]
},
...],
"relationships": [...
{
"relationship_id": 15947,
"predicate": "wears",
"synsets": ["wear.v.01"],
"subject_id": 1023838,
"object_id": 5071
},
...]
},
...]
},
...]
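
Unlike `relationships.json`, a region graph stores only `subject_id` and `object_id`, which have to be resolved against the region's own `objects` list. The following sketch shows one way to do that; `resolve_region` is a hypothetical helper, and the sample region is copied from the example above (bounding boxes and synset spans omitted for brevity).

```python
# Region copied from the example above, trimmed to the fields used here.
region = {
    "region_id": 1353,
    "phrase": "a cat sitting on a table.",
    "objects": [
        {"object_id": 1023838, "name": "cat", "synsets": ["cat.n.01"]},
        {"object_id": 5071, "name": "table", "synsets": ["table.n.01"]},
    ],
    "relationships": [
        {"relationship_id": 15947, "predicate": "wears",
         "synsets": ["wear.v.01"],
         "subject_id": 1023838, "object_id": 5071},
    ],
}


def resolve_region(region):
    """Replace subject/object IDs with names from the region's objects list."""
    names = {obj["object_id"]: obj["name"] for obj in region["objects"]}
    return [
        (names[rel["subject_id"]], rel["predicate"], names[rel["object_id"]])
        for rel in region["relationships"]
    ]


resolved = resolve_region(region)
print(region["phrase"], resolved)
```

The predicate in the output is whatever the sample data contains; the sketch does not validate that it is consistent with the phrase.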

10. Scene Graphs

  • File scene_graphs.json.zip

    All scene graphs.

| Name | Type | Description |
| --- | --- | --- |
| image_id | int | Image ID |
| objects | object array | Object instances for this image |
| ----.object_id | int | Object ID |
| ----.x | int | x-coordinate of the object bounding box |
| ----.y | int | y-coordinate of the object bounding box |
| ----.w | int | Width of the object bounding box |
| ----.h | int | Height of the object bounding box |
| ----.name | str | Object name |
| ----.synsets | str array | Synset names associated with this object |
| relationships | object array | Relationships in the image |
| ----.relationship_id | int | Relationship ID |
| ----.predicate | str | Relationship predicate (e.g. "wears") |
| ----.synsets | str array | Synset names associated with the predicate |
| ----.subject_id | int | ID of the subject (found in the objects list) |
| ----.object_id | int | ID of the object (found in the objects list) |

Example:


[...
{
"image_id": 2407890,
"objects": [...
{
"object_id": 1023838,
"x": 324,
"y": 320,
"w": 142,
"h": 255,
"name": "cat",
"synsets": ["cat.n.01"]
},
{
"object_id": 5071,
"x": 359,
"y": 362,
"w": 72,
"h": 81,
"name": "table",
"synsets": ["table.n.01"]
},
...],
"relationships": [...
{
"relationship_id": 15947,
"predicate": "wears",
"synsets": ["wear.v.01"],
"subject_id": 1023838,
"object_id": 5071
},
...]
},
...]
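
A scene graph entry is effectively a node list (`objects`) plus an edge list (`relationships`), so it can be loaded into an adjacency structure. This is a sketch under that reading, not an official loader: `build_adjacency` is a hypothetical helper and the sample is trimmed from the example above.

```python
# Scene graph copied from the example above, trimmed to the fields used here.
scene_graph = {
    "image_id": 2407890,
    "objects": [
        {"object_id": 1023838, "name": "cat", "synsets": ["cat.n.01"]},
        {"object_id": 5071, "name": "table", "synsets": ["table.n.01"]},
    ],
    "relationships": [
        {"relationship_id": 15947, "predicate": "wears",
         "synsets": ["wear.v.01"],
         "subject_id": 1023838, "object_id": 5071},
    ],
}


def build_adjacency(scene_graph):
    """Return (names, adjacency).

    names maps object_id -> object name; adjacency maps each subject's
    object_id to a list of (predicate, object_id) outgoing edges.
    """
    names = {obj["object_id"]: obj["name"] for obj in scene_graph["objects"]}
    adjacency = {object_id: [] for object_id in names}
    for rel in scene_graph["relationships"]:
        # setdefault guards against a subject missing from the objects list.
        edges = adjacency.setdefault(rel["subject_id"], [])
        edges.append((rel["predicate"], rel["object_id"]))
    return names, adjacency


names, adjacency = build_adjacency(scene_graph)
print(names, adjacency)
```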

11. Mapping from region based QA to region descriptions

  • File qa_to_region_mapping.json.zip

    Maps each QA to its corresponding region description.

    Example:

    {...
        QA_ID: REGION_DESCRIPTION_ID,
        "1885736": "2072251"
    ...}
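
In the example above both the QA ID and the region description ID appear as strings, while the other files show them as integers. A small sketch of normalizing the mapping so that it can be joined with the other files; `load_qa_to_region` is a hypothetical helper, `SAMPLE` stands in for the contents of the unzipped mapping file, and whether the real file stores values as strings or integers is an assumption that `int()` covers either way.

```python
import json

# Stand-in for the contents of qa_to_region_mapping.json (pair from the example above).
SAMPLE = '{"1885736": "2072251"}'


def load_qa_to_region(text):
    """Parse the mapping and convert both keys and values to int."""
    raw = json.loads(text)
    return {int(qa_id): int(region_id) for qa_id, region_id in raw.items()}


qa_to_region = load_qa_to_region(SAMPLE)
print(qa_to_region[1885736])
```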