The COCO-format annotation files for Pascal VOC are the ones provided by the Detectron codebase; download link: https://github.com/facebookresearch/Detectron/blob/master/detectron/datasets/data/README.md#coco-minival-annotations
Problem: the AP50 measured with the COCO evaluation protocol does not match the AP50 measured with the VOC protocol; they differ by several points (4~5).
Fix: modify the _prepare function in the evaluation code maskrcnn_benchmark/data/datasets/evaluation/coco/cocoeval.py:
# original code
# gt['ignore'] = 'iscrowd' in gt and gt['iscrowd']
# changed code
gt['ignore'] = gt['ignore'] if 'ignore' in gt else 0
gt['ignore'] = ('iscrowd' in gt and gt['iscrowd']) or gt['ignore'] # changed by hui
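The first added line just defaults ignore to 0 when the annotation has no such field, so the change stays safe on annotation files (standard COCO, for example) that carry no ignore field; the second line then treats a box as ignored if it is either a crowd box or explicitly marked ignore.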
To recap, the COCO-format Pascal VOC annotation files are the ones provided by Detectron (https://github.com/facebookresearch/Detectron/blob/master/detectron/datasets/data/README.md#coco-minival-annotations).
Loading the file and inspecting the annotations, each entry turns out to have an ignore field, presumably corresponding to something like Pascal VOC's difficult flag:
In [2]: import json
In [3]: jd_gt = json.load(open('pascal_test2007.json'))
In [4]: jd_gt['annotations'][0]
Out[4]:
{'segmentation': [[47, 239, 47, 371, 195, 371, 195, 239]],
'area': 19536,
'iscrowd': 0,
'image_id': 1,
'bbox': [47, 239, 148, 132],
'category_id': 12,
'id': 1,
'ignore': 0}
However, the standard COCO evaluation code knows nothing about an ignore field, so even annotations with a non-zero ignore are not treated specially, and the Pascal VOC data does contain boxes with non-zero ignore:
print(len(jd_gt['annotations']))
ann_gt1 = [a for a in jd_gt['annotations'] if a['iscrowd']==0]
print(len(ann_gt1))
ann_gt2 = [a for a in jd_gt['annotations'] if a['ignore']==0]
print(len(ann_gt2))
Out[]:
14976
14976
12032
This is the source of the problem: even when the ground truth itself is fed in as the detection result, AP comes out at only 80%, which roughly matches the fraction of boxes not marked as ignore (12032/14976 ≈ 0.80).
The code base used here is maskrcnn_benchmark.
To verify this, modify any model (for example RetinaNet: maskrcnn_benchmark/modeling/rpn/retinanet/retinanet.py) so that at test time it simply returns the targets:
def forward(self, images, features, targets=None):
    self.dug_eval_gt = True
    cls_logits = self.head(features)
    locations = self.compute_locations(cls_logits, strides=self.loc_strides)
    if self.training:
        return self._forward_train(locations, cls_logits, targets)
    else:
        if self.dug_eval_gt:  # test on ground-truth
            return eval_gt(self, locations, targets, images, cls_logits)
        return self._forward_test(self.loc_strides, cls_logits, images.image_sizes)

def eval_gt(self, locations, targets, images, cls_logits):
    targets = [t.to(locations[0].device) for t in targets]
    [t.add_field("scores", torch.ones(len(t.bbox))) for t in targets]
    res = targets, {}
    return res
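eval_gt simply moves the ground-truth BoxLists to the right device and attaches a constant score of 1.0 to every box, so the rest of the inference and evaluation pipeline treats them as ordinary (and perfect) detections.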
Set TEST in the config file to voc_2007_test_cocostyle:
DATASETS:
  TRAIN: ("voc_2007_train_cocostyle", "voc_2007_val_cocostyle", "voc_2012_train_cocostyle", "voc_2012_val_cocostyle")
  TEST: ("voc_2007_test_cocostyle",)
SOLVER:
  CHECKPOINT_PERIOD: 7500
  TEST_ITER: 1  # change here to enter test mode as soon as possible
OUTPUT_DIR: ./outputs/pascal/gau/base_LD2.4  # base_LD1
Run the code to start the test:
export NGPUS=4
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --master_port=9990 --nproc_per_node=$NGPUS tools/train_test_net.py --config configs/pascal_voc/retina_R_50_FPN_1x_voc.yaml
The results are as follows (I modified some of the area-range definitions, see the sketch after this table; those rows may therefore differ from the standard COCO ones, but the area=all rows should be comparable):
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.8000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.8000
Average Precision (AP) @[ IoU=0.60 | area= all | maxDets=100 ] = 0.8000
Average Precision (AP) @[ IoU=0.70 | area= all | maxDets=100 ] = 0.8000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.8000
Average Precision (AP) @[ IoU=0.80 | area= all | maxDets=100 ] = 0.8000
Average Precision (AP) @[ IoU=0.90 | area= all | maxDets=100 ] = 0.8000
Average Precision (AP) @[ IoU=0.50:0.95 | area=smallest | maxDets=100 ] = 0.2998
Average Precision (AP) @[ IoU=0.50:0.95 | area= fpn1 | maxDets=100 ] = 0.6153
Average Precision (AP) @[ IoU=0.50:0.95 | area= fpn2 | maxDets=100 ] = 0.8134
Average Precision (AP) @[ IoU=0.50:0.95 | area= fpn3 | maxDets=100 ] = 0.9144
Average Precision (AP) @[ IoU=0.50:0.95 | area=fpn4+5 | maxDets=100 ] = 0.9594
Average Precision (AP) @[ IoU=0.50:0.95 | area=largest | maxDets=100 ] = -1.0000
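Size-range splits like these are controlled through COCOeval's parameters. Below is a minimal sketch of how such custom ranges could be plugged into a pycocotools-style COCOeval (not the actual change made here); the thresholds are made up, since the smallest/fpn*/largest boundaries used above are not given, and the stock summarize() would also have to be adjusted before it prints these custom labels:
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("VOC2007/Annotations/pascal_test2007.json")
coco_dt = coco_gt.loadRes("outputs/pascal/gau/base_LD2.4/inference/voc_2007_test_cocostyle/bbox.json")
coco_eval = COCOeval(coco_gt, coco_dt, iouType="bbox")
# hypothetical area splits (in pixel^2); the real ranges used above are not shown in this post
coco_eval.params.areaRng = [[0, 1e10], [0, 16 ** 2], [16 ** 2, 32 ** 2], [32 ** 2, 64 ** 2],
                            [64 ** 2, 128 ** 2], [128 ** 2, 1e10], [512 ** 2, 1e10]]
coco_eval.params.areaRngLbl = ["all", "smallest", "fpn1", "fpn2", "fpn3", "fpn4+5", "largest"]
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # note: the stock summarize() still prints the default small/medium/large rows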
After the run above, the network's detection results (the return value of eval_gt) are written to outputs/pascal/gau/base_LD2.4/inference/voc_2007_test_cocostyle/bbox.json. Inspecting that file:
import json
det_file = "outputs/pascal/gau/base_LD2.4/inference/voc_2007_test_cocostyle/bbox.json"
gt_file = "VOC2007/Annotations/pascal_test2007.json"
image_root = 'VOC2007/JPEGImages'
jd = json.load(open(det_file))
jd_gt = json.load(open(gt_file))
print(len(jd))
print(len(jd_gt['annotations']))
Out[]:
12032 (if the dataset part of the code has not been modified, this will not necessarily be this exact number)
14976
This shows that the targets arriving here already contain fewer boxes than the ground-truth file. The targets come straight from COCODataset.__getitem__, and no later code removes boxes, so the removal must happen inside COCODataset.__getitem__. There are two places in that function where boxes can be dropped: first, the filtering on iscrowd and ignore (the ignore filter is one I added myself); second, clip_to_image, which clips the parts of a box that fall outside the image back inside and, with remove_empty=True, removes boxes with w<2 or h<2.
len_boxes1, len_boxes2 = 0, 0

def __getitem__(self, idx):
    global len_boxes1, len_boxes2
    ......
    anno = [obj for obj in anno if obj["iscrowd"] == 0]
    # ######################### add by hui ####################################
    if anno and "ignore" in anno[0]:  # filter ignore out
        anno = [obj for obj in anno if not obj["ignore"]]
    ###########################################################################
    ......
    target = target.clip_to_image(remove_empty=True)
    ......
    return img, target, idx
from tqdm import tqdm
from maskrcnn_benchmark.data.datasets import COCODataset  # assuming this import path

gt_dataset = COCODataset(gt_file, image_root, False)
print(len(gt_dataset.coco.anns))
num_box = 0
for i in tqdm(range(len(gt_dataset))):
    # call the instrumented stand-alone copy of __getitem__ defined above
    img, target, idx = __getitem__(gt_dataset, i)
    num_box += len(target.bbox)
print(num_box, len_boxes1, len_boxes2)
Out[]:
14976
12032 14976 12032
Careful tracing confirms that the culprit is exactly the ignore field mentioned earlier: some objects have a non-zero ignore, while iscrowd is 0 for every object:
print(len(jd_gt['annotations']))
ann_gt1 = [a for a in jd_gt['annotations'] if a['iscrowd']==0]
print(len(ann_gt1))
ann_gt2 = [a for a in jd_gt['annotations'] if a['ignore']==0]
print(len(ann_gt2))
Out[]:
14976
14976
12032
But this is puzzling at first, because COCO evaluation is designed to cope with exactly this situation: ground-truth boxes can be marked as gt_ignore, any detection that matches one of them by IoU becomes det_ignore, and det_ignore detections are excluded from both the TP and the FP counts, so ignored boxes should not affect the result at all. In fact, the core step by which COCO computes AP for different object sizes is precisely to mark every ground-truth box outside the current size range as gt_ignore.
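To make that mechanism concrete, here is a minimal sketch of the matching rule (a simplified illustration, not the actual cocoeval code; iou_xyxy and count_tp_fp are made-up names): a detection whose match is an ignored ground-truth box is dropped from both the TP and the FP count, so correctly flagged ignore boxes cannot hurt the score, while boxes whose flag is not honored turn into misses that can never be recovered.
def iou_xyxy(a, b):
    # IoU of two [x1, y1, x2, y2] boxes
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter + 1e-9)

def count_tp_fp(dets, gts, gt_ignore, iou_thr=0.5):
    # dets: boxes sorted by descending score; gts: boxes; gt_ignore: bool per GT
    matched = [False] * len(gts)
    tp = fp = 0
    for d in dets:
        best, best_iou = -1, iou_thr
        for i, g in enumerate(gts):
            if matched[i]:
                continue
            iou = iou_xyxy(d, g)
            if iou < iou_thr:
                continue
            # prefer a non-ignored GT over an ignored one, as the real matching does
            if best == -1 or (gt_ignore[best] and not gt_ignore[i]) \
                    or (gt_ignore[best] == gt_ignore[i] and iou > best_iou):
                best, best_iou = i, iou
        if best == -1:
            fp += 1            # unmatched detection -> false positive
        else:
            matched[best] = True
            if not gt_ignore[best]:
                tp += 1
            # else: the detection matched an ignored GT -> det_ignore, neither TP nor FP
    num_valid_gt = sum(1 for ig in gt_ignore if not ig)
    return tp, fp, num_valid_gt

# A "difficult" box that the detector never outputs: harmless if flagged as ignore,
# but it caps the recall if the flag is dropped (which is what happened here).
gts = [[47, 239, 195, 371], [10, 10, 60, 60]]
dets = [[47, 239, 195, 371]]
print(count_tp_fp(dets, gts, gt_ignore=[False, True]))    # (1, 0, 1) -> recall 1/1
print(count_tp_fp(dets, gts, gt_ignore=[False, False]))   # (1, 0, 2) -> recall 1/2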
The last stop is the evaluation code itself, maskrcnn_benchmark/data/datasets/evaluation/coco/cocoeval.py: its _prepare step only looks at the iscrowd field and never at ignore:
gt['ignore'] = 'iscrowd' in gt and gt['iscrowd']
Change it to (keeping the guard from the fix at the top, so annotation files without an ignore field still work):
gt['ignore'] = gt['ignore'] if 'ignore' in gt else 0
gt['ignore'] = ('iscrowd' in gt and gt['iscrowd']) or gt['ignore']  # changed by hui
Re-run the evaluation: OK!!!
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 1.0000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 1.0000
Average Precision (AP) @[ IoU=0.60 | area= all | maxDets=100 ] = 1.0000
Average Precision (AP) @[ IoU=0.70 | area= all | maxDets=100 ] = 1.0000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 1.0000
Average Precision (AP) @[ IoU=0.80 | area= all | maxDets=100 ] = 1.0000
Average Precision (AP) @[ IoU=0.90 | area= all | maxDets=100 ] = 1.0000
Average Precision (AP) @[ IoU=0.50:0.95 | area=smallest | maxDets=100 ] = 1.0000
Average Precision (AP) @[ IoU=0.50:0.95 | area= fpn1 | maxDets=100 ] = 1.0000
Average Precision (AP) @[ IoU=0.50:0.95 | area= fpn2 | maxDets=100 ] = 1.0000
Average Precision (AP) @[ IoU=0.50:0.95 | area= fpn3 | maxDets=100 ] = 1.0000
Average Precision (AP) @[ IoU=0.50:0.95 | area=fpn4+5 | maxDets=100 ] = 1.0000
Average Precision (AP) @[ IoU=0.50:0.95 | area=largest | maxDets=100 ] = -1.0000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.6844
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.9907
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 1.0000
Average Recall (AR) @[ IoU=0.50:0.95 | area=smallest | maxDets=100 ] = 1.0000
Average Recall (AR) @[ IoU=0.50:0.95 | area= fpn1 | maxDets=100 ] = 1.0000
Average Recall (AR) @[ IoU=0.50:0.95 | area= fpn2 | maxDets=100 ] = 1.0000
Average Recall (AR) @[ IoU=0.50:0.95 | area= fpn3 | maxDets=100 ] = 1.0000
Average Recall (AR) @[ IoU=0.50:0.95 | area=fpn4+5 | maxDets=100 ] = 1.0000
Average Recall (AR) @[ IoU=0.50:0.95 | area=largest | maxDets=100 ] = -1.0000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] is not 100% because only the top-1 detection per image (per category) is kept, and many images contain more than one object.
Pascal VOC annotation format: one XML file per image, with the box information for each object stored in the object entries. For example, the header of 000001.xml:
<annotation>
    <folder>VOC2007</folder>
    <filename>000001.jpg</filename>
    <owner>
        <flickrid>Fried Camels</flickrid>
        <name>Jinky the Fruit Bat</name>
    </owner>
    <size>
        <width>353</width>
        <height>500</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    ...
</annotation>
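The object entries then follow the header. The one below is the dog from this same 000001.xml, i.e. the box that became the first COCO-style annotation shown earlier (bbox [47, 239, 148, 132] after the 1-pixel offset and the x,y,w,h conversion); fields such as pose and truncated are omitted here, and the difficult flag is presumably what was turned into the ignore field discussed above:
    <object>
        <name>dog</name>
        <difficult>0</difficult>
        <bndbox>
            <xmin>48</xmin>
            <ymin>240</ymin>
            <xmax>195</xmax>
            <ymax>371</ymax>
        </bndbox>
    </object>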