[Paper Reading Notes] Attentive Feedback Network for Boundary-Aware Salient Object Detection






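  • FCNs perform well, but their predictions have blurry boundaries, a problem that has never been solved satisfactorily (the paper stresses this from start to finish). Techniques such as CRF (conditional random field) post-processing can sharpen the boundaries to some extent, but they add computational cost.
  • CNN architectures contain many pooling and strided operations that discard fine detail, so upsampling cannot fully recover it.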




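  • 1. Attentive Feedback Modules (AFMs), each built from an encoder block and its corresponding decoder block, capture the overall structure of the object.
  • 2. A Boundary-Enhanced Loss (BEL) produces fine boundaries and helps the network learn saliency predictions along object contours.
  • 3. A new Global Perception Module (GPM) obtains a global view by applying convolution after the feature map has been split into blocks and the blocks stacked.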
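
To make the split-and-stack idea in contribution 3 concrete, here is a minimal NumPy sketch of my reading of it. The function names, the (C, H, W) layout and the grid size `n` are my own assumptions for illustration, not the authors' code. The feature map is cut into n x n cells and the cells are stacked along the channel axis, so a small convolution applied afterwards sees the same local position in every cell at once, which means it sees pixels from all over the original map.

```python
import numpy as np

def split_and_stack(feat, n):
    """Cut a (C, H, W) map into an n x n grid of cells and stack the cells
    along the channel axis, giving (n*n*C, H//n, W//n)."""
    c, h, w = feat.shape
    assert h % n == 0 and w % n == 0, "H and W must be divisible by n"
    ch, cw = h // n, w // n
    # (C, n, ch, n, cw) -> (n, n, C, ch, cw) -> (n*n*C, ch, cw)
    cells = feat.reshape(c, n, ch, n, cw).transpose(1, 3, 0, 2, 4)
    return cells.reshape(n * n * c, ch, cw)

def unstack_and_merge(stacked, n, c):
    """Inverse of split_and_stack: put the cells back in their grid positions."""
    _, ch, cw = stacked.shape
    cells = stacked.reshape(n, n, c, ch, cw).transpose(2, 0, 3, 1, 4)
    return cells.reshape(c, n * ch, n * cw)

feat = np.arange(2 * 4 * 4, dtype=float).reshape(2, 4, 4)
stacked = split_and_stack(feat, n=2)        # shape (8, 2, 2)
restored = unstack_and_merge(stacked, n=2, c=2)
```

In a real network a convolution would sit between the two calls; it is left out here so the sketch only shows the rearrangement.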



Coarse-to-fine solution.

Considering that simply concatenating features from different scales may fail if disordered by the ambiguous information, coarse-to-fine solutions are employed in recent state-of-the-art methods such as RefineNet [20], PiCANet [22] and RAS [5]. The authors address this limitation by introducing a recursive aggregation method which fuses the coarse features to generate high-resolution semantic features stage-by-stage. In this paper, we similarly integrate hierarchical features from coarse to fine scales by constructing skip-connections between scale-matching encoder and decoder blocks. However, we think the weakness of the recursive aggregation method is that the coarse information may still mislead the finer one without proper guidance. Thus, we build Attentive Feedback Modules (AFMs) to guide the message passing among encoder and decoder blocks.

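The authors lay this out very clearly. They start from a common problem: simply concatenating feature maps of different scales (joining them along the channel dimension; PyTorch provides torch.cat for this) tends to work poorly, because those feature maps carry a lot of ambiguous information. Following that logic, they list how current leading models such as RefineNet, PiCANet and RAS deal with it: all of them use a coarse-to-fine scheme that aggregates features recursively, stage by stage. This paper likewise fuses features from coarse to fine, using skip connections between scale-matching encoder and decoder blocks. The authors argue, however, that recursive aggregation has a weakness: without proper guidance, whatever is wrong or redundant in the coarse map can still "mislead" the refinement steps that follow. That motivates the first contribution, AFMs that guide the refinement of the feature maps level by level.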
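
For reference, the plain "upsample and concatenate" skip connection that this paragraph treats as the baseline can be sketched in a few lines of NumPy. This is only an illustration under my own assumptions (a (C, H, W) layout, nearest-neighbor upsampling, hypothetical function names); it is not the fusion the paper ends up using, since the AFM adds guidance on top of it.

```python
import numpy as np

def upsample_nearest(feat, scale):
    """Nearest-neighbor upsampling of a (C, H, W) map by an integer factor."""
    return feat.repeat(scale, axis=1).repeat(scale, axis=2)

def skip_concat(coarse, fine):
    """Bring the coarse decoder map to the resolution of the fine encoder
    map, then join the two along the channel axis."""
    scale = fine.shape[1] // coarse.shape[1]
    up = upsample_nearest(coarse, scale)
    return np.concatenate([up, fine], axis=0)

coarse = np.random.rand(4, 7, 7)    # low resolution, semantically strong
fine = np.random.rand(2, 14, 14)    # high resolution, detail-rich
fused = skip_concat(coarse, fine)   # shape (6, 14, 14)
```

Nothing in this fusion tells the network which coarse responses to trust, which is exactly the weakness the authors point at.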



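  1. The input F first passes through the encoder's five convolutional blocks, and the result is fed into the GPM to obtain a coarse saliency map SG.
  2. SG is then refined level by level following the AFM structure shown on the right of the figure, with supervision applied at each level, until the final 224×224 output F is obtained. The last two levels additionally use the second contribution, the BEL loss, to refine the boundaries.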

AFM (Attentive Feedback Module)



The AFM provides an opportunity for error corrections using a ternary attention map in the second time-step feedback stream. We introduce it to provide credible templates of foreground and background for reference. A proper way for our end-to-end training strategy is to exploit the refined prediction S^(l,1) in the first time-step as a reference. Reviewing the morphological dilation and erosion, the former can gain weight for lightly drawn figures, and the latter is a dual operation which allows the thicker figures to get skinny. Motivated by that, we can ease the negative effects on boundaries by thinning down the salient regions through erosion.
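
A small NumPy sketch helps me see how dilation and erosion can turn a saliency prediction into a three-level map. Two things here are my own assumptions rather than something the quoted passage states: implementing the morphology with stride-1 max pooling (dilation is a max filter, erosion is the max filter applied to the negated map), and averaging the dilated and eroded maps to get the ternary result. The kernel size `k` and all function names are hypothetical.

```python
import numpy as np

def max_pool_same(x, k):
    """k x k max filter with stride 1 and same-size output (k odd).
    Padding with -inf means the border never wins the maximum."""
    x = np.asarray(x, dtype=float)
    pad = k // 2
    padded = np.pad(x, pad, mode="constant", constant_values=-np.inf)
    h, w = x.shape
    out = np.full((h, w), -np.inf)
    for dy in range(k):
        for dx in range(k):
            out = np.maximum(out, padded[dy:dy + h, dx:dx + w])
    return out

def dilate(s, k):
    """Grow the salient region: local maximum."""
    return max_pool_same(s, k)

def erode(s, k):
    """Shrink the salient region: local minimum, via the dual of dilation."""
    return -max_pool_same(-np.asarray(s, dtype=float), k)

def ternary_attention(s, k):
    """About 1 in the confident foreground, about 0 in the confident
    background, and about 0.5 in the uncertain band around the boundary."""
    return (dilate(s, k) + erode(s, k)) / 2.0

s = np.zeros((7, 7))
s[2:5, 2:5] = 1.0                  # toy first time-step prediction
t = ternary_attention(s, k=3)
```

On this toy input only the center pixel survives erosion, so it is the only confident foreground; the ring that dilation adds and the pixels that erosion removes all land at 0.5.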



For l = 3, 4, 5, the loss function just contains the first term, i.e. the cross-entropy loss for saliency detection. It is because that these layers do not maintain the details needed for recovering exquisite outlines. By extracting boundaries from the saliency predictions themselves, the boundary-enhanced loss enhances the model to take more efforts on boundaries.
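
The loss structure described here can be sketched as follows. The quoted text only tells us that the first term is cross-entropy and that boundaries are extracted from the saliency maps themselves; the rest is my assumption for illustration: the boundary extractor |X - avgpool(X)|, the mean-squared distance between boundary maps, the weights `w_ce` and `w_edge`, and all function names.

```python
import numpy as np

def avg_pool_same(x, k=3):
    """k x k average filter with stride 1 and same-size output (k odd).
    Edge padding keeps flat regions flat at the image border."""
    x = np.asarray(x, dtype=float)
    pad = k // 2
    padded = np.pad(x, pad, mode="edge")
    h, w = x.shape
    out = np.zeros((h, w))
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)

def boundary_map(x, k=3):
    """Zero where the map is locally constant, positive near transitions."""
    x = np.asarray(x, dtype=float)
    return np.abs(x - avg_pool_same(x, k))

def bce(pred, gt, eps=1e-7):
    """Mean binary cross-entropy between a prediction and its ground truth."""
    p = np.clip(pred, eps, 1.0 - eps)
    return float(np.mean(-(gt * np.log(p) + (1.0 - gt) * np.log(1.0 - p))))

def boundary_enhanced_loss(pred, gt, level, w_ce=1.0, w_edge=10.0):
    """Cross-entropy at every level; the boundary term only at levels 1 and 2.
    The weights are hypothetical defaults."""
    loss = w_ce * bce(pred, gt)
    if level in (1, 2):
        diff = boundary_map(pred) - boundary_map(gt)
        loss += w_edge * float(np.mean(diff ** 2))
    return loss

gt = np.zeros((8, 8))
gt[2:6, 2:6] = 1.0
pred = avg_pool_same(gt)           # a blurred, boundary-poor prediction
fine_loss = boundary_enhanced_loss(pred, gt, level=1)
coarse_loss = boundary_enhanced_loss(pred, gt, level=3)
```

With the blurred prediction, the fine-level loss is larger than the coarse-level one because only the fine levels are penalized for the smeared boundary.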




