Paper Reading: PolarMask: Single Shot Instance Segmentation with Polar Representation

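Contents

  • 1. Paper overview
  • 2. Advantages of polar coordinates over Cartesian coordinates
  • 3. Positive sample assignment and its advantages
  • 4. Converting the 36 polar points to Cartesian coordinates
  • 5. Polar Centerness
  • 6. Computing the Polar IoU Loss
  • 7. Ablation Study
  • 8. Comparison to state-of-the-art
  • References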

1. Paper overview

[Image 1]

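The authors posted this paper on arXiv on October 10 this year (2019). It was presumably submitted to CVPR 2020 and is still under review. The takeaway is that you have to publish fast: FCOS, which PolarMask builds on, was submitted to ICCV 2019, and that conference runs from October 27 to November 3, 2019, so this follow-up appeared before FCOS had even been formally presented.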

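I think this paper matters a lot: it is the first one I have seen that models object detection and instance segmentation with the same formulation. Instance segmentation becomes the general case of object detection, and object detection a special case of instance segmentation. The authors achieve this by representing objects in polar coordinates. For object detection, an object is described by a center point and the distances from that center to the 4 sides of the box. For instance segmentation, it is described by a center point and the distances from that center to 36 points on the contour (note: the 36 points are taken every 10 degrees, i.e. one full circle).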

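The PolarMask pipeline: it builds on FCOS and is an anchor-free, single-stage instance segmentation method. The backbone is ResNet + FPN, followed by three branches. The first is the classification branch, which takes the H×W×C feature map and outputs H×W×K, where K is the number of classes. The second is the centerness map, with output H×W×1, which estimates how likely each location is to be a center point; at test time the classification and centerness outputs are multiplied together. The third is the regression branch, with output H×W×N, where N is the number of rays and each value is the distance from that location to the contour along one ray. The paper uses 36 rays, so 36 points encode the object's mask. The paper also verifies experimentally that adding a box branch makes little difference to performance. My guess is that this is because the box encoding and the mask encoding are of the same kind, so an extra box branch has little to add.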
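To make the three output shapes concrete, here is a minimal NumPy sketch of how the classification and centerness maps might be combined at test time. The function name `combine_scores` and the toy sizes are assumptions for illustration; this is not code from the PolarMask repository.

```python
import numpy as np

def combine_scores(cls_prob, centerness):
    """Multiply per-class probabilities by the centerness map.

    cls_prob:   (H, W, K) class probabilities per location.
    centerness: (H, W, 1) centerness per location.
    Returns an (H, W, K) score map; broadcasting repeats centerness over K.
    """
    return cls_prob * centerness

# Toy sizes (assumed): a 4x5 feature map, K=80 classes, N=36 rays.
H, W, K, N = 4, 5, 80, 36
rng = np.random.default_rng(0)
cls_prob = rng.random((H, W, K))            # classification branch
centerness = rng.random((H, W, 1))          # centerness branch
distances = rng.random((H, W, N)) * 100.0   # regression branch: one length per ray

scores = combine_scores(cls_prob, centerness)
print(scores.shape, distances.shape)  # (4, 5, 80) (4, 5, 36)
```

Every location therefore carries K class scores and N ray lengths, which is all that is needed to assemble a mask for that location.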

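Here is the first author's own walkthrough of the paper on Zhihu: PolarMask: 一阶段实例分割新思路 ("PolarMask: a new approach to one-stage instance segmentation")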

Thus, our aim is to design a conceptually simple mask prediction module that can be easily plugged into many off-the-shelf detectors, enabling instance segmentation.

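The authors stress repeatedly that the significance of this method is not its raw performance, but that it makes a problem as complex as instance segmentation as simple as single-stage object detection.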

We instantiate such an instance segmentation method by using the recent object detector FCOS [25], mainly for its simplicity. Note that, it is possible to use other detectors such as RetinaNet [18], YOLO [23] with minimal modification to our framework. Specifically, we propose PolarMask, formulating instance segmentation as instance center classification and dense distance regression in a polar coordinate, shown in Figure 2. The model takes an input image and predicts the distance from a sampled positive location (candidates of the instance center) to the instance contour at each angle, and after assembling, outputs the final mask. The overall pipeline of PolarMask is almost as simple and clean as FCOS. It introduces negligible computation overhead. Simplicity and efficiency are the two key factors to single shot instance segmentation, and PolarMask achieves them successfully.

In order to maximize the advantages of Polar Representation, we propose Polar Centerness and Polar IoU Loss to deal with sampling high-quality center examples and optimization for dense distance regression, respectively. They improve mask accuracy by about 15% relatively, showing considerable gains under stricter localization metrics. Without bells and whistles, PolarMask achieves 32.9% in mask mAP with single-model and single-scale training/testing on the challenging COCO dataset [19].

2. Advantages of polar coordinates over Cartesian coordinates

In this work, we design instance segmentation methods
based on the Polar Representation since its inherent advantages are as follows:
(1) The origin point of the polar coordinate can be seen as the center of object.
(2) Starting from the origin point, the point in contour is determined by the
distance and angle.
(3) The angle is naturally directional
and makes it very convenient to connect the points into a
whole contour. We claim that Cartesian Representation may
exhibit first two properties similarly. However, it lacks the
advantage of the third property.

3. Positive sample assignment and its advantages

Center Samples
Location (x, y) is considered as a center sample
if it falls into areas around the mass-center of any instance.
Otherwise it is a negative sample. We define the
region for sampling positive pixels to be 1.5× strides [25]
of the feature map from the mass-center to left, top, right
and bottom. Thus each instance has about 9∼16 pixels near
the mass-center as center examples. (I don't fully understand this sampling strategy.)
It has two advantages:
(1) Increasing the number of positive samples from 1 to
9∼16 can largely avoid imbalance of positive and negative
samples. Nevertheless, focal loss [18] is still needed when
training the classification branch.
(2) Mass-center may not
be the best center sample of an instance. More candidate
points make it possible to automatically find the best center
of one instance. We will discuss it in details in Section 3.3.
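One way to read the sampling rule quoted above is as a square window of half-width 1.5 × stride around the mass-center. The sketch below works under that reading, for a single instance on a single feature level. The function name `center_samples` is hypothetical, and the mapping from a feature-map cell to an image coordinate (`index * stride + stride // 2`, as FCOS does) is an assumption.

```python
import numpy as np

def center_samples(h, w, stride, mass_center, radius=1.5):
    """Boolean (h, w) map: True where a feature-map location is a center sample.

    mass_center is (cx, cy) in image pixels. A location is positive when its
    image coordinate lies within radius * stride of the mass-center in both
    x and y (assumed reading of the rule in the paper).
    """
    cx, cy = mass_center
    xs = np.arange(w) * stride + stride // 2   # image x of each column
    ys = np.arange(h) * stride + stride // 2   # image y of each row
    gx, gy = np.meshgrid(xs, ys)               # both have shape (h, w)
    r = radius * stride
    return (np.abs(gx - cx) <= r) & (np.abs(gy - cy) <= r)

# The window is 3 strides wide, so it covers 3 or 4 cells along each axis.
print(center_samples(16, 16, 8, (60, 60)).sum())  # 9
print(center_samples(16, 16, 8, (64, 64)).sum())  # 16
```

Under these assumptions the count depends on where the mass-center falls relative to the grid, which is consistent with the "about 9∼16 pixels" in the quoted passage.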

4. Converting the 36 polar points to Cartesian coordinates

[Image 2]
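The conversion named in this section's title is plain trigonometry: for a center (x_c, y_c) and a ray of length d_i at angle θ_i, the contour point is x_i = x_c + d_i·cos θ_i, y_i = y_c + d_i·sin θ_i. Below is a minimal sketch, assuming the rays are evenly spaced and start at 0°; the function name `polar_to_contour` and the ray ordering are illustrative assumptions.

```python
import numpy as np

def polar_to_contour(center, distances):
    """Convert per-ray distances around a center into (N, 2) contour points.

    Ray i is assumed to point at angle i * 360 / N degrees, so 36 distances
    give one point every 10 degrees.
    """
    xc, yc = center
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    theta = np.deg2rad(np.arange(n) * 360.0 / n)
    xs = xc + d * np.cos(theta)
    ys = yc + d * np.sin(theta)
    return np.stack([xs, ys], axis=1)

# A circle of radius 5 around (10, 20), described by 36 equal distances.
contour = polar_to_contour((10.0, 20.0), np.full(36, 5.0))
print(contour.shape)  # (36, 2)
```

Because the points come out already ordered by angle, connecting consecutive points (and the last back to the first) gives the mask polygon directly.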

We apply NMS to remove redundant masks. To fasten
the process, We calculate the smallest bounding boxes of
masks and then apply NMS based on the IoU of boxes. We
verify that such a simplified post-processing do not negatively effect the final mask performance. (Note: the final NMS step uses the smallest bounding rectangle of each mask.)
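The box-based NMS described above only needs two ingredients: the smallest axis-aligned box around each predicted contour, and the IoU between two such boxes. The sketch below illustrates both; the helper names `contour_to_box` and `box_iou` are hypothetical and the code is not taken from the authors' implementation.

```python
import numpy as np

def contour_to_box(contour):
    """Smallest axis-aligned box [x1, y1, x2, y2] enclosing (N, 2) contour points."""
    pts = np.asarray(contour, dtype=float)
    x1, y1 = pts.min(axis=0)
    x2, y2 = pts.max(axis=0)
    return np.array([x1, y1, x2, y2])

def box_iou(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

box = contour_to_box([(0, 0), (2, 1), (1, 3)])
print(box)  # [0. 0. 2. 3.]
```

A standard NMS loop would then discard any mask whose box IoU with a higher-scoring mask exceeds a chosen threshold.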

5. Polar Centerness

[Image 3]
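The paper defines Polar Centerness from the ray lengths at a location as sqrt(min(d_1..d_n) / max(d_1..d_n)). A minimal sketch of that definition follows, assuming all distances are strictly positive; the function name `polar_centerness` is illustrative.

```python
import numpy as np

def polar_centerness(distances):
    """sqrt(min(d) / max(d)) over the ray lengths at one location.

    Close to 1 when all rays have similar length (a well-centered point),
    close to 0 when the shortest ray is much shorter than the longest.
    """
    d = np.asarray(distances, dtype=float)
    return float(np.sqrt(d.min() / d.max()))

print(polar_centerness([5.0] * 36))            # 1.0
print(polar_centerness([1.0, 4.0, 4.0, 4.0]))  # 0.5
```

Multiplying the classification score by this value down-weights locations near the object boundary, where one ray is very short.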

For reference, a note on the centerness used in FCOS:

[Image: centerness in FCOS]
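For comparison, FCOS computes centerness from the four distances (l, t, r, b) between a location and the sides of the box as sqrt((min(l, r) / max(l, r)) · (min(t, b) / max(t, b))). The sketch below assumes positive distances, and `fcos_centerness` is an illustrative name.

```python
import math

def fcos_centerness(l, t, r, b):
    """FCOS centerness for a location with distances l, t, r, b to the box sides."""
    horizontal = min(l, r) / max(l, r)   # 1.0 when horizontally centered
    vertical = min(t, b) / max(t, b)     # 1.0 when vertically centered
    return math.sqrt(horizontal * vertical)

print(fcos_centerness(2, 2, 2, 2))  # 1.0
print(fcos_centerness(1, 1, 4, 4))  # 0.25
```

Polar Centerness follows the same idea, with the two left/right and top/bottom ratios replaced by a single min/max ratio over all rays.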

6. Computing the Polar IoU Loss

[Image 4]

[Image 5]

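Note equation (5) at the very bottom of the image above: the max and min there simply compare two values for each ray, the ground-truth distance and the predicted distance.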
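In its simplified form, the loss takes the per-ray maximum and minimum of the predicted and ground-truth distances, sums each over all rays, and takes the log of the ratio: loss = log(Σ max(d_i, d_i*) / Σ min(d_i, d_i*)). A minimal NumPy sketch, assuming strictly positive distances and a single location; the function name `polar_iou_loss` is illustrative and this is not the authors' code.

```python
import numpy as np

def polar_iou_loss(pred, target):
    """log(sum(max(pred, target)) / sum(min(pred, target))) over the rays.

    pred and target are the N predicted and ground-truth ray lengths at one
    location. The loss is 0 for a perfect prediction and grows as they diverge.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    d_max = np.maximum(pred, target)   # element-wise, one value per ray
    d_min = np.minimum(pred, target)
    return float(np.log(d_max.sum() / d_min.sum()))

target = np.full(36, 10.0)
print(polar_iou_loss(target, target))      # 0.0
print(polar_iou_loss(2 * target, target))  # log(2), about 0.693
```

Because the sums run over all rays before the ratio is taken, the 36 distances are optimized jointly rather than one at a time as with a per-ray smooth-l1 loss.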

Our proposed Polar IoU Loss exhibits two advantageous properties (the advantages of the Polar IoU Loss):
(1) It is differentiable, enabling back propagation; and is very easy to implement parallel computations, thus facilitating a fast training process.
(2) It predicts the regression targets as a whole. It improves the overall performance by a large margin compared with smooth-l1 loss, shown in our experiments.
(3) As a bonus, Polar IoU Loss is able to automatically keep the balance between classification loss and regression loss of dense distance prediction. We will discuss it in detail in our experiments.

7. Ablation Study

[Image 6]

8. Comparison to state-of-the-art

[Image 7]
Among these, TensorMask has the great Kaiming He behind it; I need to read it later.

References

1. PolarMask: 一阶段实例分割新思路 (Zhihu article by the paper's first author)
