Positive and Negative Sample Assignment in Object Detection: MaxIoUAssigner

Contents

  • Preface
  • 1. MaxIoUAssigner
  • Summary


Preface

These are my notes on how positive and negative samples are assigned in object detection, based on the code in mmdetection.

1. MaxIoUAssigner

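The matching rules are explained very clearly in the following article. I summarize them here; for the details, see: https://zhuanlan.zhihu.com/p/346198300
(1) Initialize every anchor as an ignored sample: set each anchor's assigned index (assigned_gt_inds in the code) to -1, which marks it as ignored.
(2) Find the background samples: compute the IoU between each anchor and all gts; if the largest of these IoUs is below neg_iou_thr, set the anchor's assigned index to 0, which marks it as a negative (background) sample.
(3) Find the high-quality positives: for each anchor, compute its IoU with all gts and take the gt with the largest IoU; if that IoU is greater than or equal to pos_iou_thr, set the anchor's assigned index to that gt's index + 1, meaning the anchor is responsible for predicting that gt bbox and is a high-quality anchor.
(4) Add a few more positives: after step 3, some gts may have no anchor at all, because none of their IoUs reaches pos_iou_thr. So for each gt we also find the anchor with which it has the largest IoU; if that IoU is greater than or equal to min_pos_iou, the anchor's assigned index is set to the gt's index + 1, meaning the anchor is responsible for predicting that gt. This step does its best to make sure every gt has an anchor. If even the best IoU is below min_pos_iou, nothing more can be done and the gt is effectively ignored. Steps 3 and 4 partly overlap: when a gt's largest IoU is greater than or equal to pos_iou_thr, it is also at least min_pos_iou (assuming min_pos_iou does not exceed pos_iou_thr), so step 4 picks an anchor that step 3 has already made positive.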
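
The four steps above can be sketched on a tiny IoU matrix. This is a minimal plain-Python illustration, not mmdetection's implementation: the function name max_iou_assign, the nested-list input, and the threshold values are assumptions made for the example, and it assumes there is at least one gt and one anchor.

```python
def max_iou_assign(overlaps, pos_iou_thr=0.5, neg_iou_thr=0.4,
                   min_pos_iou=0.3, match_low_quality=True):
    """Toy version of the four steps (hypothetical helper, not mmdetection code).

    overlaps[i][j] is the IoU between gt i and anchor j.
    Returns one value per anchor: -1 ignored, 0 negative, i + 1 positive for gt i.
    """
    num_gts, num_anchors = len(overlaps), len(overlaps[0])

    # Step 1: every anchor starts as ignored.
    assigned = [-1] * num_anchors

    for j in range(num_anchors):
        column = [overlaps[i][j] for i in range(num_gts)]
        best_iou = max(column)
        if best_iou < neg_iou_thr:
            # Step 2: the best IoU is too small, so this is background.
            assigned[j] = 0
        if best_iou >= pos_iou_thr:
            # Step 3: high-quality positive for the best-overlapping gt.
            assigned[j] = column.index(best_iou) + 1

    if match_low_quality:
        # Step 4: give each gt its best anchor if that IoU reaches min_pos_iou.
        for i in range(num_gts):
            best_iou = max(overlaps[i])
            if best_iou >= min_pos_iou:
                assigned[overlaps[i].index(best_iou)] = i + 1
    return assigned


# Two gts (rows) and four anchors (columns); the values are made up.
overlaps = [
    [0.70, 0.45, 0.10, 0.20],  # gt 0
    [0.05, 0.10, 0.35, 0.42],  # gt 1: never reaches pos_iou_thr
]
assigned = max_iou_assign(overlaps)
without_step4 = max_iou_assign(overlaps, match_low_quality=False)
print(assigned)       # [1, -1, 0, 2]
print(without_step4)  # [1, -1, 0, -1]
```

With step 4 enabled, gt 1 picks up anchor 3 (IoU 0.42) even though that anchor was ignored after step 3; with it disabled, gt 1 has no anchor at all.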

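  • From these four steps we can see that a gt may be matched to several anchors, but an anchor can never be matched to more than one gt. In step 4 each gt is matched to at most one anchor; in practice, to add a few more positives, the gt_max_assign_all parameter lets a gt be matched to every anchor that ties for its largest IoU. The anchors brought in by step 4 are usually low quality; they can add noise to training and may even be counterproductive.

  • In other words: an anchor whose largest IoU with the gts is below neg_iou_thr is a negative sample; an anchor whose largest IoU with some gt is at least pos_iou_thr is responsible for that gt; if the largest IoU between a gt and all anchors is below pos_iou_thr but at least min_pos_iou, the anchor with that largest IoU is still made responsible for the gt; all remaining anchors are treated as ignored.

  • Enabling low-quality matching guarantees high recall (there are more positive samples), but it brings in low-quality boxes, and some high-quality matches may be replaced by low-quality ones. It is usually enabled in the first stage of a two-stage detector and disabled after that. For one-stage detectors it depends on the case; usually it is disabled.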
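
To make the gt_max_assign_all behaviour mentioned in the bullets above concrete, here is a small plain-Python sketch of step 4 alone. The helper name low_quality_match and the numbers are assumptions for illustration; in the real tensor code the anchor returned by argmax on a tie is not necessarily the first one, whereas list.index always returns the first.

```python
def low_quality_match(overlaps, assigned, min_pos_iou=0.3,
                      gt_max_assign_all=True):
    """Apply only step 4 to a copy of `assigned` (hypothetical helper)."""
    assigned = list(assigned)
    for i, row in enumerate(overlaps):
        best_iou = max(row)
        if best_iou < min_pos_iou:
            continue  # even the best anchor is too poor; this gt gets nothing
        if gt_max_assign_all:
            # Every anchor that ties for the max IoU is assigned to gt i.
            for j, iou in enumerate(row):
                if iou == best_iou:
                    assigned[j] = i + 1
        else:
            # Only one best anchor is assigned (here: the first one found).
            assigned[row.index(best_iou)] = i + 1
    return assigned


overlaps = [[0.35, 0.35, 0.10]]  # one gt; anchors 0 and 1 tie at 0.35
before = [-1, -1, 0]             # state after steps 1 to 3

tied = low_quality_match(overlaps, before, gt_max_assign_all=True)
single = low_quality_match(overlaps, before, gt_max_assign_all=False)
print(tied)    # [1, 1, 0]
print(single)  # [1, -1, 0]
```

So the extra positives from gt_max_assign_all only appear when several anchors have exactly the same largest IoU with a gt.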

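Now look at the code below, which has detailed comments on some of the steps.
Inputs: overlaps: the IoU between each GT and every anchor; gt_labels: the labels of the GT boxes.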
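
As a quick aside on the overlaps input: it is a matrix with one row per gt and one column per anchor. The sketch below builds such a matrix in plain Python, assuming boxes are given as (x1, y1, x2, y2); the helper names box_iou and pairwise_iou are made up for this example and are not mmdetection's own IoU calculator.

```python
def box_iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def pairwise_iou(gt_bboxes, anchors):
    """Return a k x n nested list: row i is gt i, column j is anchor j."""
    return [[box_iou(g, a) for a in anchors] for g in gt_bboxes]


gts = [(0, 0, 10, 10)]
anchors = [(0, 0, 10, 10), (5, 0, 15, 10), (20, 20, 30, 30)]
overlaps = pairwise_iou(gts, anchors)
# Identical box, half-shifted box (50 / 150), and a disjoint box.
print(overlaps)
```

The shape matches what the method expects: len(overlaps) is k (number of gts) and len(overlaps[0]) is n (number of anchors).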

    def assign_wrt_overlaps(self, overlaps, gt_labels=None):
        """Assign w.r.t. the overlaps of bboxes with gts.

        Args:
            overlaps (Tensor): Overlaps between k gt_bboxes and n bboxes,
                shape(k, n).
            gt_labels (Tensor, optional): Labels of k gt_bboxes, shape (k, ).

        Returns:
            :obj:`AssignResult`: The assign result.
        """		# overlaps[k,n]
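        num_gts, num_bboxes = overlaps.size(0), overlaps.size(1)  # number of gts, number of anchors

        # 1. assign -1 by default: every anchor starts as ignored
        assigned_gt_inds = overlaps.new_full((num_bboxes, ),
                                             -1,
                                             dtype=torch.long)
        # Handle the empty case: with no gts every anchor becomes background (0),
        # and the labels are None or all -1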
        if num_gts == 0 or num_bboxes == 0:
            # No ground truth or boxes, return empty assignment
            max_overlaps = overlaps.new_zeros((num_bboxes, ))
            if num_gts == 0:
                # No truth, assign everything to background
                assigned_gt_inds[:] = 0
            if gt_labels is None:
                assigned_labels = None
            else:
                assigned_labels = overlaps.new_full((num_bboxes, ),
                                                    -1,
                                                    dtype=torch.long)
            return AssignResult(
                num_gts,
                assigned_gt_inds,
                max_overlaps,
                labels=assigned_labels)

        # for each anchor, which gt best overlaps with it
        # for each anchor, the max iou of all gts; max_overlaps [n,], argmax_overlaps [n,]
        max_overlaps, argmax_overlaps = overlaps.max(dim=0)  # per anchor: max IoU over all gts and the index of that gt
        # for each gt, which anchor best overlaps with it
        # for each gt, the max iou of all proposals; gt_max_overlaps [k,], gt_argmax_overlaps [k,]
        gt_max_overlaps, gt_argmax_overlaps = overlaps.max(dim=1)  # per gt: max IoU over all anchors and the index of that anchor
        # e.g. gt_argmax_overlaps: tensor([172633, 157411, 160394, 160536, 173559, 156765, 159719, 151235, 184116, 158785], device='cuda:0')
        # 2. assign negative: below
        # the negative inds are set to be 0
        if isinstance(self.neg_iou_thr, float):
            assigned_gt_inds[(max_overlaps >= 0)  # negatives are set to 0
                             & (max_overlaps < self.neg_iou_thr)] = 0
        elif isinstance(self.neg_iou_thr, tuple):
            assert len(self.neg_iou_thr) == 2
            assigned_gt_inds[(max_overlaps >= self.neg_iou_thr[0])
                             & (max_overlaps < self.neg_iou_thr[1])] = 0

        # 3. assign positive: above positive IoU threshold
        pos_inds = max_overlaps >= self.pos_iou_thr  # boolean mask of shape [n,] marking the positives
        assigned_gt_inds[pos_inds] = argmax_overlaps[pos_inds] + 1  # gt index in [0, num_gts) plus 1
        if self.match_low_quality:  # whether low-quality matching is enabled
            # Low-quality matching will overwrite the assigned_gt_inds assigned
            # in Step 3. Thus, the assigned gt might not be the best one for
            # prediction.
            # For example, if bbox A has 0.9 and 0.8 iou with GT bbox 1 & 2,
            # bbox 1 will be assigned as the best target for bbox A in step 3.
            # However, if GT bbox 2's gt_argmax_overlaps = A, bbox A's
            # assigned_gt_inds will be overwritten to be GT bbox 2.
            # This might be the reason that it is not used in ROI Heads.
            for i in range(num_gts):  # for each gt, look at its max IoU over all anchors
                if gt_max_overlaps[i] >= self.min_pos_iou:  # the max IoU reaches the minimum positive threshold
                    if self.gt_max_assign_all:  # assign the gt to every anchor that ties for its max IoU
                        max_iou_inds = overlaps[i, :] == gt_max_overlaps[i]  # True for each anchor whose IoU with this gt equals the max
                        assigned_gt_inds[max_iou_inds] = i + 1  # gt index + 1
                    else:
                        assigned_gt_inds[gt_argmax_overlaps[i]] = i + 1  # only the argmax anchor

        if gt_labels is not None:
            assigned_labels = assigned_gt_inds.new_full((num_bboxes, ), -1)  # initialize to -1
            pos_inds = torch.nonzero(
                assigned_gt_inds > 0, as_tuple=False).squeeze()  # indices of the positive anchors
            if pos_inds.numel() > 0:
                assigned_labels[pos_inds] = gt_labels[
                    assigned_gt_inds[pos_inds] - 1]  # undo the +1 above to look up the gt's label
        else:
            assigned_labels = None

        return AssignResult(
            num_gts, assigned_gt_inds, max_overlaps, labels=assigned_labels)
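
To read the result, it helps to decode assigned_gt_inds by value: -1 is ignored, 0 is negative, and a value v > 0 means the anchor belongs to gt v - 1. The following plain-Python sketch shows that decoding on lists; split_assignment is a hypothetical helper, not part of AssignResult, and the label values are made up.

```python
def split_assignment(assigned_gt_inds, gt_labels):
    """Decode per-anchor assignments into index lists and labels (hypothetical helper)."""
    pos = [j for j, v in enumerate(assigned_gt_inds) if v > 0]
    neg = [j for j, v in enumerate(assigned_gt_inds) if v == 0]
    ignored = [j for j, v in enumerate(assigned_gt_inds) if v == -1]
    # Positive anchors take the label of gt (v - 1); all others keep -1,
    # mirroring how assigned_labels is filled in the method above.
    labels = [gt_labels[v - 1] if v > 0 else -1 for v in assigned_gt_inds]
    return pos, neg, ignored, labels


pos, neg, ignored, labels = split_assignment([1, -1, 0, 2], gt_labels=[7, 3])
print(pos)      # [0, 3]
print(neg)      # [2]
print(ignored)  # [1]
print(labels)   # [7, -1, -1, 3]
```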

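Summary

1. Initialize every sample to -1.
2. For each anchor, compute its IoU with all gts and keep the largest value and its index: anchors that meet the negative threshold get 0, anchors that meet the positive threshold get the gt's index + 1, and the rest stay ignored (some gts may end up with no anchor here, because all of their IoUs are below the positive threshold).
3. For each gt, compute its IoU with all anchors and keep the largest value and its index, then compare with step 2. If the match is the same, overwriting changes nothing. If one anchor is the best match for two gts, the later gt overwrites the earlier one, which leaves the earlier gt without that anchor; and if the earlier IoU was actually the larger one, the result is a low-quality match. (I actually don't know why the code does not compare the two and choose here; if you know, feel free to discuss in the comments or by private message.)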
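
The overwrite described in step 3 can be reproduced with the 0.9 / 0.8 case from the code comment. The sketch below is plain Python with made-up names; the guard flag is purely hypothetical (it is not an option in mmdetection) and only illustrates what a "compare before overwriting" check might look like.

```python
def assign_with_low_quality(overlaps, pos_iou_thr=0.5, min_pos_iou=0.3,
                            guard=False):
    """Steps 3 and 4 only, on a k x n nested list (hypothetical helper)."""
    num_gts, num_anchors = len(overlaps), len(overlaps[0])
    assigned = [-1] * num_anchors

    # Step 3: each anchor takes its best gt if the IoU reaches pos_iou_thr.
    for j in range(num_anchors):
        column = [overlaps[i][j] for i in range(num_gts)]
        if max(column) >= pos_iou_thr:
            assigned[j] = column.index(max(column)) + 1

    # Step 4: each gt claims its best anchor, overwriting earlier results.
    for i in range(num_gts):
        best_iou = max(overlaps[i])
        j = overlaps[i].index(best_iou)
        if best_iou < min_pos_iou:
            continue
        current = assigned[j]
        # Hypothetical check: keep the current gt if it overlaps the anchor more.
        if guard and current > 0 and overlaps[current - 1][j] > best_iou:
            continue
        assigned[j] = i + 1
    return assigned


# Anchor A (column 0) has IoU 0.9 with GT 1 (row 0) and 0.8 with GT 2 (row 1).
overlaps = [[0.9, 0.2],
            [0.8, 0.1]]
overwritten = assign_with_low_quality(overlaps)
guarded = assign_with_low_quality(overlaps, guard=True)
print(overwritten)  # [2, -1]
print(guarded)      # [1, -1]
```

In this toy case the plain rule hands anchor A to GT 2 and leaves GT 1 with nothing, while the guarded variant keeps the better match but leaves GT 2 with nothing. Either way one gt loses its anchor, which may be part of why a simple comparison is not an obvious win.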
