Pitfalls I hit with cocoapi, and how I got out of them

  • Problem 1: data type mismatch during model evaluation
  • Problem 2: an empty list is passed in when loading predictions during model evaluation

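I've recently been running experiments on the COCO dataset, and a couple of days ago I was lucky enough to get my hands on one of the lab's servers. After moving my code over and setting up the environment, I found that evaluation with COCO's own API threw errors on the server, and it took me a long time to sort everything out. Below are the two problems I ran into and how I got out of each.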

Problem 1: data type mismatch during model evaluation

Error message
TypeError: object of type <class 'numpy.float64'> cannot be safely interpreted as an integer

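Cause
The problem is in the Params class in cocoeval.py of the cocoapi library. The class looks like this (this listing already has the int() cast from the fix further down applied):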

class Params:
    '''
    Params for coco evaluation api
    '''
    def setDetParams(self):
        self.imgIds = []
        self.catIds = []
        # np.arange causes trouble.  the data point on arange is slightly larger than the true value
        self.iouThrs = np.linspace(.5, 0.95, int(np.round((0.95 - .5) / .05)) + 1, endpoint=True)
        self.recThrs = np.linspace(.0, 1.00, int(np.round((1.00 - .0) / .01)) + 1, endpoint=True)
        self.maxDets = [1, 10, 100]
        self.areaRng = [[0 ** 2, 1e5 ** 2], [0 ** 2, 32 ** 2], [32 ** 2, 96 ** 2], [96 ** 2, 1e5 ** 2]]
        self.areaRngLbl = ['all', 'small', 'medium', 'large']
        self.useCats = 1

    def setKpParams(self):
        self.imgIds = []
        self.catIds = []
        # np.arange causes trouble.  the data point on arange is slightly larger than the true value
        self.iouThrs = np.linspace(.5, 0.95, int(np.round((0.95 - .5) / .05)) + 1, endpoint=True)
        self.recThrs = np.linspace(.0, 1.00, int(np.round((1.00 - .0) / .01)) + 1, endpoint=True)
        self.maxDets = [20]
        self.areaRng = [[0 ** 2, 1e5 ** 2], [32 ** 2, 96 ** 2], [96 ** 2, 1e5 ** 2]]
        self.areaRngLbl = ['all', 'medium', 'large']
        self.useCats = 1

    def __init__(self, iouType='segm'):
        if iouType == 'segm' or iouType == 'bbox':
            self.setDetParams()
        elif iouType == 'keypoints':
            self.setKpParams()
        else:
            raise Exception('iouType not supported')
        self.iouType = iouType
        # useSegm is deprecated
        self.useSegm = None

The problem is in these four lines, which in the unpatched file read:

self.iouThrs = np.linspace(.5, 0.95, np.round((0.95 - .5) / .05) + 1, endpoint=True)
self.recThrs = np.linspace(.0, 1.00, np.round((1.00 - .0) / .01) + 1, endpoint=True)
.......
self.iouThrs = np.linspace(.5, 0.95, np.round((0.95 - .5) / .05) + 1, endpoint=True)
self.recThrs = np.linspace(.0, 1.00, np.round((1.00 - .0) / .01) + 1, endpoint=True)

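np.round() returns a float (numpy.float64). In numpy versions newer than 1.11, np.linspace no longer accepts a float for its num argument: it was deprecated in 1.12 and later releases raise the TypeError above. So an explicit conversion to int is needed.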
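
To make the type issue concrete, here is a minimal sketch that reproduces the two threshold arrays outside of pycocotools. The variable names num, iou_thrs and rec_thrs are mine, chosen for illustration. It only shows what np.round() returns and that the int() cast yields the expected number of samples.

```python
import numpy as np

# np.round() on a Python float gives back a numpy.float64, not an int.
num = np.round((0.95 - .5) / .05) + 1
print(type(num).__name__, num)

# Casting to int gives np.linspace the integer sample count it expects.
iou_thrs = np.linspace(.5, 0.95, int(num), endpoint=True)
rec_thrs = np.linspace(.0, 1.00, int(np.round((1.00 - .0) / .01)) + 1, endpoint=True)
print(len(iou_thrs), len(rec_thrs))
```

Whether passing num to np.linspace without the cast fails outright or only warns depends on the installed numpy version, so the sketch deliberately avoids doing that.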

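Solution
If you don't want to touch the code, you can downgrade numpy to 1.11, but I don't recommend it. Better to patch the source: just wrap each np.round() call in an int() cast. The file is at lib/python3.6/site-packages/pycocotools/cocoeval.py
Specifically, change: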

self.iouThrs = np.linspace(.5, 0.95, np.round((0.95 - .5) / .05) + 1, endpoint=True)
self.recThrs = np.linspace(.0, 1.00, np.round((1.00 - .0) / .01) + 1, endpoint=True)
......
self.iouThrs = np.linspace(.5, 0.95, np.round((0.95 - .5) / .05) + 1, endpoint=True)
self.recThrs = np.linspace(.0, 1.00, np.round((1.00 - .0) / .01) + 1, endpoint=True)

to

self.iouThrs = np.linspace(.5, 0.95, int(np.round((0.95 - .5) / .05)) + 1, endpoint=True)
self.recThrs = np.linspace(.0, 1.00, int(np.round((1.00 - .0) / .01)) + 1, endpoint=True)
......
self.iouThrs = np.linspace(.5, 0.95, int(np.round((0.95 - .5) / .05)) + 1, endpoint=True)
self.recThrs = np.linspace(.0, 1.00, int(np.round((1.00 - .0) / .01)) + 1, endpoint=True)

Note that all four lines need changing. Don't miss any!
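
If you would rather not edit the four lines by hand, the same change can be sketched as a text substitution. The helper add_int_casts and the pattern UNCAST_ROUND below are hypothetical names of my own, not part of pycocotools, and the pattern only targets np.round(...) calls of exactly the shape shown above. The sketch works on a string; applying it to the real file (after backing it up) is left to you.

```python
import re

# Matches an np.round(...) call of the shape used in cocoeval.py that is not
# already wrapped in int(...). The lookbehind keeps the patch idempotent.
UNCAST_ROUND = re.compile(r"(?<!int\()np\.round\((\(.*?\) / [.\d]+)\)")


def add_int_casts(source):
    """Hypothetical helper: wrap each bare np.round(...) in int(...)."""
    return UNCAST_ROUND.sub(r"int(np.round(\1))", source)


# Two of the four offending lines, copied from the unpatched file.
unpatched = (
    "self.iouThrs = np.linspace(.5, 0.95, np.round((0.95 - .5) / .05) + 1, endpoint=True)\n"
    "self.recThrs = np.linspace(.0, 1.00, np.round((1.00 - .0) / .01) + 1, endpoint=True)\n"
)
patched = add_int_casts(unpatched)
print(patched)
```

Running the helper a second time on already patched text leaves it unchanged, so it is safe to use on a file where only some of the lines were fixed.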

Problem 2: an empty list is passed in when loading predictions during model evaluation

Error message

Traceback (most recent call last):
  File "train.py", line 174, in <module>
    args.checkpoint_after, args.val_after)
  File "train.py", line 133, in train
    evaluate(val_labels, val_output_name, val_images_folder, net)
  File "/media/lab349/dbdb0721-5114-4f63-982f-3cfbad742077/JamesChen/lightweight-human-pose-estimation.pytorch-master/val.py", line 160, in evaluate
    run_coco_eval(labels, output_name)
  File "/media/lab349/dbdb0721-5114-4f63-982f-3cfbad742077/JamesChen/lightweight-human-pose-estimation.pytorch-master/val.py", line 22, in run_coco_eval
    coco_dt = coco_gt.loadRes(dt_file_path)
  File "/home/lab349/.conda/envs/zzcPose/lib/python3.6/site-packages/pycocotools/coco.py", line 325, in loadRes
    if 'caption' in anns[0]:
IndexError: list index out of range

Cause
First, find where coco.py raises the error: it is in cocoapi's loadRes function. Its source is:

def loadRes(self, resFile):
        """
        Load result file and return a result api object.
        :param   resFile (str)     : file name of result file
        :return: res (obj)         : result api object
        """
        res = COCO()
        res.dataset['images'] = [img for img in self.dataset['images']]

        print('Loading and preparing results...')
        tic = time.time()

        # Check result type in a way compatible with Python 2 and 3.
        if PYTHON_VERSION == 2:
            is_string =  isinstance(resFile, basestring)  # Python 2
        elif PYTHON_VERSION == 3:
            is_string = isinstance(resFile, str)  # Python 3
        if is_string:
            anns = json.load(open(resFile))
        elif type(resFile) == np.ndarray:
            anns = self.loadNumpyAnnotations(resFile)
        else:
            anns = resFile
        assert type(anns) == list, 'results in not an array of objects'
        annsImgIds = [ann['image_id'] for ann in anns]
        assert set(annsImgIds) == (set(annsImgIds) & set(self.getImgIds())), \
               'Results do not correspond to current coco set'
        if 'caption' in anns[0]:
            imgIds = set([img['id'] for img in res.dataset['images']]) & set([ann['image_id'] for ann in anns])
            res.dataset['images'] = [img for img in res.dataset['images'] if img['id'] in imgIds]
            for id, ann in enumerate(anns):
                ann['id'] = id+1
        elif 'bbox' in anns[0] and not anns[0]['bbox'] == []:
            res.dataset['categories'] = copy.deepcopy(self.dataset['categories'])
            for id, ann in enumerate(anns):
                bb = ann['bbox']
                x1, x2, y1, y2 = [bb[0], bb[0]+bb[2], bb[1], bb[1]+bb[3]]
                if not 'segmentation' in ann:
                    ann['segmentation'] = [[x1, y1, x1, y2, x2, y2, x2, y1]]
                ann['area'] = bb[2]*bb[3]
                ann['id'] = id+1
                ann['iscrowd'] = 0
        elif 'segmentation' in anns[0]:
            res.dataset['categories'] = copy.deepcopy(self.dataset['categories'])
            for id, ann in enumerate(anns):
                # now only support compressed RLE format as segmentation results
                ann['area'] = maskUtils.area(ann['segmentation'])
                if not 'bbox' in ann:
                    ann['bbox'] = maskUtils.toBbox(ann['segmentation'])
                ann['id'] = id+1
                ann['iscrowd'] = 0
        elif 'keypoints' in anns[0]:
            res.dataset['categories'] = copy.deepcopy(self.dataset['categories'])
            for id, ann in enumerate(anns):
                s = ann['keypoints']
                x = s[0::3]
                y = s[1::3]
                x0,x1,y0,y1 = np.min(x), np.max(x), np.min(y), np.max(y)
                ann['area'] = (x1-x0)*(y1-y0)
                ann['id'] = id + 1
                ann['bbox'] = [x0,y0,x1-x0,y1-y0]
        print('DONE (t={:0.2f}s)'.format(time.time()- tic))

        res.dataset['annotations'] = anns
        res.createIndex()
        return res

The error is raised by this line:

if 'caption' in anns[0]:

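The message is list index out of range: accessing element 0 is out of bounds, which means the list anns is empty. So what is anns? It holds the predictions made on the COCO val set during validation. If it is empty, one possibility is that the model hasn't predicted anything at that point, and that turned out to be exactly the case. I was using a network I had just modified, with the backbone and the refine modules after it all replaced, so no pretrained weights were loaded and the model was training from scratch. At the first validation pass it detected nothing, so an empty result was handed to loadRes as resFile, and indexing into anns blew up.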
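
The sketch below walks through the same checks loadRes performs, using an empty result list, to show why the two assert statements let it through and the failure only appears at anns[0]. The image ids in known_img_ids are made up for illustration; in loadRes the set comes from self.getImgIds().

```python
import json

# What a results file would contain if the model predicted nothing.
anns = json.loads("[]")

# Check 1 in loadRes: an empty list is still a list, so this passes.
assert type(anns) == list

# Check 2 in loadRes: with no annotations both sides are empty sets,
# so this passes too. known_img_ids is a made-up stand-in.
known_img_ids = {1, 2, 3}
anns_img_ids = [ann['image_id'] for ann in anns]
assert set(anns_img_ids) == (set(anns_img_ids) & known_img_ids)

# Only the first access to anns[0] fails.
try:
    'caption' in anns[0]
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```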

Solution
Once you know where the problem is, it's easy to fix. I simply set things up so that no evaluation is run at the very start of training.
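
A different option from the one I used is to keep evaluation switched on but check the results file before handing it to loadRes. The following is a minimal sketch under that assumption. has_predictions is a hypothetical helper of my own, not a pycocotools function, and the sample annotation written to filled.json is made-up data.

```python
import json
import os
import tempfile


def has_predictions(results_path):
    """Return True only if results_path holds a non-empty JSON list."""
    if not os.path.isfile(results_path):
        return False
    with open(results_path) as f:
        try:
            results = json.load(f)
        except json.JSONDecodeError:
            return False
    return isinstance(results, list) and len(results) > 0


# Small demonstration with throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    empty_path = os.path.join(tmp, "empty.json")
    filled_path = os.path.join(tmp, "filled.json")
    with open(empty_path, "w") as f:
        json.dump([], f)
    with open(filled_path, "w") as f:
        # Made-up annotation, just to have a non-empty list.
        json.dump([{"image_id": 1, "category_id": 1,
                    "keypoints": [0, 0, 1], "score": 0.5}], f)

    empty_ok = has_predictions(empty_path)
    filled_ok = has_predictions(filled_path)
    missing_ok = has_predictions(os.path.join(tmp, "missing.json"))

print(empty_ok, filled_ok, missing_ok)
```

Going by the traceback above, the natural place for such a check would be just before the coco_gt.loadRes(dt_file_path) call in run_coco_eval, returning early with a message when it fails. I haven't verified that against the rest of val.py, so treat it as a suggestion.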

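One more thing
Tracking this down cost me a lot of time. At first I thought it was an environment issue, but switching cocoapi versions didn't help. Then I thought I had broken something in the code, but a careful comparison with the earlier code showed it was identical. Searching online turned up nothing that solved it either. Finally, at https://github.com/RangiLyu/nanodet/issues/120, I saw someone say: “It’s weird. Maybe because the model detected nothing in the Val dataset.” That's when it clicked that the model might simply not be producing any predictions.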

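Plenty of people online ask about this error, but none of the answers covered my case, so I'm writing it down here in the hope that it gives you one more thing to check if you hit the same problem.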
