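Assign and sample are the core operations of anchor target computation.
Assignment is usually IoU-based; mmdet also provides ATSS-based and point-based assigners, among others.
Sampling is usually random; there are also OHEM-based samplers and a pseudo sampler that performs no real sampling.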
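In mmdet the assigner and the sampler are usually chosen together in the training config. The fragment below is a sketch of what that looks like; the variable name `rpn_train_cfg` is made up for illustration, and the values are the ones commonly seen in Faster R-CNN RPN configs as far as I recall, so check them against your own mmdet version.

```python
# Sketch of the assigner/sampler part of an mmdet-style training config.
# Assumption: values follow the usual Faster R-CNN RPN setting; they are
# not taken from this article and may differ between mmdet versions.
rpn_train_cfg = dict(
    assigner=dict(
        type='MaxIoUAssigner',
        pos_iou_thr=0.7,      # max IoU >= 0.7 -> positive
        neg_iou_thr=0.3,      # max IoU <  0.3 -> negative
        min_pos_iou=0.3,      # floor for the "best anchor of each gt" rule
        ignore_iof_thr=-1),   # negative value: no IoF-based ignoring
    sampler=dict(
        type='RandomSampler',
        num=256,              # anchors kept per image
        pos_fraction=0.5,     # at most half of them positive
        neg_pos_ub=-1,        # no upper bound on the negative:positive ratio
        add_gt_as_proposals=False))
```

Anchors whose best IoU falls between `neg_iou_thr` and `pos_iou_thr` are neither positive nor negative under such a setting, which is where the -1 ("don't care") value described below comes from.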
MaxIoUAssigner
This is the most commonly used assign method. Below are the class description and the constructor argument documentation from mmdet.
"""Assign a corresponding gt bbox or background to each bbox.
Each proposals will be assigned with `-1`, `0`, or a positive integer
indicating the ground truth index.
- -1: don't care
- 0: negative sample, no assigned gt
- positive integer: positive sample, index (1-based) of assigned gt
Args:
    pos_iou_thr (float): IoU threshold for positive bboxes.
    neg_iou_thr (float or tuple): IoU threshold for negative bboxes.
    min_pos_iou (float): Minimum iou for a bbox to be considered as a
        positive bbox. Positive samples can have smaller IoU than
        pos_iou_thr due to the 4th step (assign max IoU sample to each gt).
    gt_max_assign_all (bool): Whether to assign all bboxes with the same
        highest overlap with some gt to that gt.
    ignore_iof_thr (float): IoF threshold for ignoring bboxes (if
        `gt_bboxes_ignore` is specified). Negative values mean not
        ignoring any bboxes.
    ignore_wrt_candidates (bool): Whether to compute the iof between
        `bboxes` and `gt_bboxes_ignore`, or the contrary.
    gpu_assign_thr (int): The upper bound of the number of GT for GPU
        assign. When the number of gt is above this threshold, will assign
        on CPU device. Negative values mean not assign on CPU.
"""
The core method assign decides the status of each existing anchor (the bboxes argument) from its IoU with gt_bboxes. Its docstring is:
"""Assign gt to bboxes.
This method assign a gt bbox to every bbox (proposal/anchor), each bbox
will be assigned with -1, 0, or a positive number. -1 means don't care,
0 means negative sample, positive number is the index (1-based) of
assigned gt.
The assignment is done in following steps, the order matters.
1. assign every bbox to -1
2. assign proposals whose iou with all gts < neg_iou_thr to 0
3. for each bbox, if the iou with its nearest gt >= pos_iou_thr,
   assign it to that bbox
4. for each gt bbox, assign its nearest proposals (may be more than
   one) to itself
Args:
    bboxes (Tensor): Bounding boxes to be assigned, shape(n, 4).
    gt_bboxes (Tensor): Groundtruth boxes, shape (k, 4).
    gt_bboxes_ignore (Tensor, optional): Ground truth bboxes that are
        labelled as `ignored`, e.g., crowd boxes in COCO.
    gt_labels (Tensor, optional): Label of gt_bboxes, shape (k, ).
Returns:
    :obj:`AssignResult`: The assign result.
"""
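assign first checks whether there are too many GT bboxes (more than gpu_assign_thr). If so, the computation is moved to the CPU, because otherwise GPU memory may run out.
Next, bbox_overlaps computes the IoU overlaps between the anchors and the gt bboxes.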
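To make the shape of that overlap matrix concrete, here is a minimal plain-Python sketch. It is not mmdet's `bbox_overlaps` (which works on tensors, and in some versions may use a +1 pixel convention for widths and heights); the helper names `box_area`, `iou` and `iou_matrix` are made up for this illustration.

```python
def box_area(b):
    """Area of an (x1, y1, x2, y2) box; degenerate boxes count as 0."""
    return max(b[2] - b[0], 0.0) * max(b[3] - b[1], 0.0)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(min(a[2], b[2]) - max(a[0], b[0]), 0.0)
    ih = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
    inter = iw * ih
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def iou_matrix(gt_bboxes, bboxes):
    """k x n matrix: row i holds the IoU of gt i with every anchor."""
    return [[iou(g, a) for a in bboxes] for g in gt_bboxes]


gts = [(0, 0, 10, 10)]
anchors = [(0, 0, 10, 10), (5, 0, 15, 10), (20, 20, 30, 30)]
overlaps = iou_matrix(gts, anchors)   # one row (k = 1), three columns (n = 3)
```

The row-per-gt, column-per-anchor layout matches the `shape(k, n)` that assign_wrt_overlaps expects.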
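If ignored gt bboxes (gt_bboxes_ignore) are given and the IoF threshold ignore_iof_thr is set, anchors that overlap an ignored box by more than that threshold are marked as ignored too.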
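The ignoring step can be sketched as follows. This is a simplified plain-Python illustration rather than mmdet's code: `mark_ignored` is a hypothetical helper, the per-anchor IoF values are assumed to be computed beforehand, and treating a non-positive threshold as "disabled" is my reading of the docstring.

```python
def mark_ignored(overlaps, ignore_max_iof, ignore_iof_thr):
    """Set the overlaps column of every ignored anchor to -1.

    overlaps:       k x n list of lists (gt x anchor IoU)
    ignore_max_iof: length-n list, each anchor's largest IoF with any
                    ignored gt box (assumed to be precomputed)
    """
    if ignore_iof_thr <= 0:
        # Assumption: a non-positive threshold disables ignoring.
        return [list(row) for row in overlaps]
    ignored = [v > ignore_iof_thr for v in ignore_max_iof]
    return [[-1.0 if ignored[j] else row[j] for j in range(len(row))]
            for row in overlaps]


overlaps = [[0.8, 0.1, 0.2]]
masked = mark_ignored(overlaps, ignore_max_iof=[0.0, 0.9, 0.3],
                      ignore_iof_thr=0.5)
```

An anchor whose whole column is -1 has a maximum overlap of -1, so it fails the `max_overlaps >= 0` test in the negative-assignment step and keeps its default value of -1 (don't care).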
Finally, the anchors are assigned according to IoU by the assign_wrt_overlaps function.
Let us look at how assign_wrt_overlaps is implemented.
"""Assign w.r.t. the overlaps of bboxes with gts.
Args:
    overlaps (Tensor): Overlaps between k gt_bboxes and n bboxes,
        shape(k, n).
    gt_labels (Tensor, optional): Labels of k gt_bboxes, shape (k, ).
"""
num_gts, num_bboxes = overlaps.size(0), overlaps.size(1)
# 1. assign -1 by default
# assigned_gt_inds holds the gt bbox index for each anchor:
# -1 means ignored anchor, 0 means negative sample,
# a positive number is the (1-based) index of the gt bbox
assigned_gt_inds = overlaps.new_full((num_bboxes, ),
                                     -1,
                                     dtype=torch.long)
# for each anchor, which gt best overlaps with it
# for each anchor, the max iou of all gts
max_overlaps, argmax_overlaps = overlaps.max(dim=0)
# for each gt, which anchor best overlaps with it
# for each gt, the max iou of all proposals
gt_max_overlaps, gt_argmax_overlaps = overlaps.max(dim=1)
# 2. assign negative: below
if isinstance(self.neg_iou_thr, float):
    assigned_gt_inds[(max_overlaps >= 0)
                     & (max_overlaps < self.neg_iou_thr)] = 0
elif isinstance(self.neg_iou_thr, tuple):
    assert len(self.neg_iou_thr) == 2
    assigned_gt_inds[(max_overlaps >= self.neg_iou_thr[0])
                     & (max_overlaps < self.neg_iou_thr[1])] = 0
# 3. assign positive: above positive IoU threshold
pos_inds = max_overlaps >= self.pos_iou_thr
assigned_gt_inds[pos_inds] = argmax_overlaps[pos_inds] + 1
# 4. assign fg: for each gt, proposals with highest IoU
for i in range(num_gts):
    if gt_max_overlaps[i] >= self.min_pos_iou:
        if self.gt_max_assign_all:
            max_iou_inds = overlaps[i, :] == gt_max_overlaps[i]
            assigned_gt_inds[max_iou_inds] = i + 1
        else:
            assigned_gt_inds[gt_argmax_overlaps[i]] = i + 1
if gt_labels is not None:
    assigned_labels = assigned_gt_inds.new_zeros((num_bboxes, ))
    pos_inds = torch.nonzero(assigned_gt_inds > 0).squeeze()
    if pos_inds.numel() > 0:
        assigned_labels[pos_inds] = gt_labels[
            assigned_gt_inds[pos_inds] - 1]
else:
    assigned_labels = None
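A small worked example may help to see what the four steps produce. The function below is a plain-Python rerun of the logic above on lists instead of tensors, assuming a float neg_iou_thr and gt_max_assign_all=True; it is a sketch for illustration and the overlap numbers are invented.

```python
def assign_wrt_overlaps(overlaps, pos_iou_thr, neg_iou_thr, min_pos_iou):
    """Plain-Python version of the four steps.

    overlaps is a k x n list of lists (gt x anchor); a column of -1
    marks an anchor that was ignored earlier.
    """
    num_gts, num_bboxes = len(overlaps), len(overlaps[0])
    assigned = [-1] * num_bboxes                       # step 1
    for j in range(num_bboxes):
        column = [overlaps[i][j] for i in range(num_gts)]
        best = max(column)
        if 0 <= best < neg_iou_thr:                    # step 2
            assigned[j] = 0
        if best >= pos_iou_thr:                        # step 3
            assigned[j] = column.index(best) + 1
    for i in range(num_gts):                           # step 4
        gt_best = max(overlaps[i])
        if gt_best >= min_pos_iou:
            for j in range(num_bboxes):
                if overlaps[i][j] == gt_best:
                    assigned[j] = i + 1
    return assigned


#            a0    a1    a2    a3    a4
overlaps = [[0.80, 0.10, 0.40, 0.00, -1.0],   # gt 1
            [0.20, 0.05, 0.45, 0.00, -1.0]]   # gt 2
result = assign_wrt_overlaps(overlaps, pos_iou_thr=0.7,
                             neg_iou_thr=0.3, min_pos_iou=0.3)
print(result)   # [1, 0, 2, 0, -1]
```

Anchor a2 never reaches pos_iou_thr (its best IoU is 0.45), yet it becomes a positive for gt 2 in step 4 because it is the best anchor that gt 2 has. Anchor a4 was ignored beforehand and stays at -1.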
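assign returns an AssignResult data class that holds num_gts, assigned_gt_inds, max_overlaps and labels (= assigned_labels).
num_gts: the number of gt bboxes
assigned_gt_inds: the gt index assigned to each anchor. -1: ignored, 0: negative sample, positive number: 1-based index of the matched gt bbox
max_overlaps: for each anchor, its maximum IoU over all gt bboxes
labels: the class label of each positive bbox, taken from its assigned gt (None if gt_labels is not given)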
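Appendix: what IoF means
IoF: intersection over foreground, i.e. the intersection area divided by the area of the foreground box instead of by the union.
IoU is used in most places; IoF can be useful in NMS. Suppose two bboxes predict the same person: one encloses the whole person, the other covers only the upper part of the body. Their IoU can be below 0.5, so the partial bbox survives NMS. The IoF measured against the partial bbox, however, is 1, so judging overlap with IoF > 0.5 may be more effective in this case.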
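The person example can be checked with a few lines of arithmetic. The boxes and helper names below are invented for illustration, and `iof` divides by the area of its first argument, which is how I understand mmdet's 'iof' mode to behave.

```python
def intersection(a, b):
    """Intersection area of two (x1, y1, x2, y2) boxes."""
    iw = max(min(a[2], b[2]) - max(a[0], b[0]), 0.0)
    ih = max(min(a[3], b[3]) - max(a[1], b[1]), 0.0)
    return iw * ih


def area(b):
    return (b[2] - b[0]) * (b[3] - b[1])


def iou(a, b):
    inter = intersection(a, b)
    return inter / (area(a) + area(b) - inter)


def iof(fg, other):
    """Intersection divided by the area of the first (foreground) box."""
    return intersection(fg, other) / area(fg)


full_body = (0, 0, 40, 100)    # encloses the whole person
upper_part = (0, 0, 40, 40)    # covers only the top 40% of it
iou_value = iou(upper_part, full_body)   # 1600 / 4000 = 0.4
iof_value = iof(upper_part, full_body)   # 1600 / 1600 = 1.0
```

With an IoU threshold of 0.5 the two boxes are not treated as duplicates, while an IoF threshold of 0.5 on the smaller box would suppress it.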
sampler
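Random sampling is the most common choice, and there is also OHEM-based sampling. mmdet additionally has PseudoSampler, which is used in architectures that do no sampling, such as RetinaNet; it simply collects all positive and negative samples.
mmdet defines the abstract sampler class BaseSampler with three interfaces: _sample_pos, _sample_neg and sample, which sample positives, negatives and all samples respectively.
A sampler returns a SamplingResult object that contains pos_inds, neg_inds, bboxes, gt_bboxes, assign_result and gt_flags.
pos_inds: indices of the positive anchors
neg_inds: indices of the negative anchors
bboxes: the anchors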
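The difference between random sampling and pseudo sampling can be sketched on top of an assigned_gt_inds list. This is a simplified illustration, not mmdet's RandomSampler or PseudoSampler: the function names are made up, and options such as neg_pos_ub and add_gt_as_proposals are left out.

```python
import random


def random_sample(assigned_gt_inds, num=256, pos_fraction=0.5, seed=None):
    """Pick at most `num` anchors, at most num * pos_fraction of them positive."""
    rng = random.Random(seed)
    pos = [i for i, v in enumerate(assigned_gt_inds) if v > 0]
    neg = [i for i, v in enumerate(assigned_gt_inds) if v == 0]
    num_pos = min(int(num * pos_fraction), len(pos))
    pos_inds = sorted(rng.sample(pos, num_pos))
    num_neg = min(num - num_pos, len(neg))    # negatives fill the remainder
    neg_inds = sorted(rng.sample(neg, num_neg))
    return pos_inds, neg_inds


def pseudo_sample(assigned_gt_inds):
    """No sampling: return every positive and every negative anchor."""
    pos_inds = [i for i, v in enumerate(assigned_gt_inds) if v > 0]
    neg_inds = [i for i, v in enumerate(assigned_gt_inds) if v == 0]
    return pos_inds, neg_inds


assigned = [1, 0, 2, 0, -1, 0, 0, 1]
pos_inds, neg_inds = random_sample(assigned, num=4, pos_fraction=0.5, seed=0)
all_pos, all_neg = pseudo_sample(assigned)
```

In both cases the ignored anchor (value -1, index 4 here) ends up in neither list, so it contributes nothing to the loss.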