Deeplabv3+ Reading Notes

Notes: Google's Deeplabv3+ code is now open source; see deeplab (GitHub), which also includes a usage demo.

0.

spatial pyramid pooling
  • probing the incoming features with filters or pooling operations at multiple rates and multiple effective fields-of-view
  • encode multi-scale contextual information
encoder-decoder
  • gradually recovering the spatial information
  • capture sharper object boundaries
convolution
  • depthwise convolution: a spatial convolution performed independently over each channel of an input
  • pointwise convolution: a 1x1 convolution, projecting the channels output by the depthwise convolution onto a new channel space
  • difference in the order of operations:
    • Inception: 1x1 conv first
    • depthwise separable convolution: channel-wise spatial conv first
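
The depthwise-then-pointwise order described above can be sketched in a few lines of plain Python. This is only a toy illustration under stated assumptions: the function names (`depthwise_conv`, `pointwise_conv`, `depthwise_separable_conv`), the nested-list `[C][H][W]` layout, stride 1 and no padding are all choices made here, not taken from the Deeplabv3+ code.

```python
def depthwise_conv(x, kernels):
    """Spatial convolution applied to each channel independently.

    x: nested list [C][H][W]; kernels: nested list [C][k][k], one per channel.
    Assumes stride 1 and no padding (output is (H-k+1) x (W-k+1)).
    """
    out = []
    for channel, kernel in zip(x, kernels):
        k = len(kernel)
        h_out = len(channel) - k + 1
        w_out = len(channel[0]) - k + 1
        out.append([
            [sum(channel[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(w_out)]
            for i in range(h_out)
        ])
    return out


def pointwise_conv(x, weights):
    """1x1 convolution: mixes channels at each spatial position.

    x: [C_in][H][W]; weights: [C_out][C_in].
    """
    h, w = len(x[0]), len(x[0][0])
    return [
        [[sum(row[c] * x[c][i][j] for c in range(len(x))) for j in range(w)]
         for i in range(h)]
        for row in weights
    ]


def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Depthwise (channel-wise) convolution first, then pointwise 1x1 projection."""
    return pointwise_conv(depthwise_conv(x, dw_kernels), pw_weights)


# Two input channels of size 3x3, projected onto three output channels.
x = [
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0, 1], [0, 1, 0], [1, 0, 1]],
]
dw_kernels = [
    [[1, 0], [0, 1]],
    [[1, 1], [1, 1]],
]
pw_weights = [[1, 1], [1, -1], [0, 2]]
y = depthwise_separable_conv(x, dw_kernels, pw_weights)
```

With a k x k kernel, C_in input channels and C_out output channels, this factorization needs k*k*C_in + C_in*C_out weights instead of the k*k*C_in*C_out of a standard convolution, which is where the computational savings come from.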

details can be found here.

contributions
  1. decoder module: refines the segmentation results, especially along object boundaries
  2. depthwise separable convolution, applied to both:
    • Atrous Spatial Pyramid Pooling
    • decoder
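
To make "multiple rates and multiple effective fields-of-view" concrete, here is a minimal 1D sketch of atrous (dilated) convolution applied at several rates in parallel, in the spirit of Atrous Spatial Pyramid Pooling. It is a toy under stated assumptions: the real module operates on 2D feature maps, and the function names (`atrous_conv1d`, `effective_kernel_size`, `toy_aspp`) and the rates used are illustrative, not taken from the paper or the released code.

```python
def atrous_conv1d(signal, kernel, rate):
    """1D atrous convolution (cross-correlation form, as in deep learning frameworks).

    Kernel taps are spaced `rate` samples apart. Positions outside the
    signal are treated as zeros, so the output has the same length as the input.
    """
    center = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0
        for t, w in enumerate(kernel):
            idx = i + (t - center) * rate
            if 0 <= idx < n:
                acc += w * signal[idx]
        out.append(acc)
    return out


def effective_kernel_size(k, rate):
    """Span covered by a k-tap kernel whose taps are `rate` apart."""
    return k + (k - 1) * (rate - 1)


def toy_aspp(signal, kernel, rates):
    """Probe the same input at several rates; returns one branch per rate."""
    return [atrous_conv1d(signal, kernel, r) for r in rates]


signal = [1, 2, 3, 4, 5]
kernel = [1, 1, 1]
branches = toy_aspp(signal, kernel, rates=(1, 2))
```

The number of weights stays the same at every rate; only the spacing between taps changes, so a larger rate covers a wider context at no extra parameter cost. In a full module the branches would be concatenated along the channel axis and fused with a 1x1 convolution.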

3. Methods

  • capture multi-scale context
