[Paper note] Xception: Deep Learning with Depthwise Separable Convolutions

  • paper

Intuition

  • Inception series
  • A regular Conv layer has to map cross-channel correlations and spatial correlations at the same time.
  • Inception module makes this process easier and more efficient by explicitly factoring it into a series of operations that would independently look at cross-channel correlations and at spatial correlations.
  • 1x1 Conv -> cross-channel correlation; 3x3 & 5x5 Conv -> spatial correlation.
    [Figure 1]
  • An extreme version of this separation is to entirely decouple the cross-channel and spatial operations. This is the idea behind Xception ("Extreme Inception").
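
To make the decoupling concrete, here is a minimal pure-Python sketch of the two steps: a 1x1 Conv that only mixes channels, followed by a 3x3 Conv applied to each channel separately. The function names, the nested-list layout `[channel][row][col]`, and the "valid" padding are assumptions made for illustration, not the paper's implementation. No non-linearity is placed between the two steps.

```python
def pointwise_conv(x, w):
    """1x1 Conv: mixes channels only.

    x is indexed [channel][row][col]; w[o][i] is the weight from
    input channel i to output channel o.
    """
    c_in, h, wd = len(x), len(x[0]), len(x[0][0])
    return [[[sum(w[o][i] * x[i][r][c] for i in range(c_in))
              for c in range(wd)]
             for r in range(h)]
            for o in range(len(w))]


def depthwise_conv3x3(x, k):
    """Per-channel 3x3 Conv with 'valid' padding: looks at space only.

    k[ch] is the 3x3 kernel used for channel ch; channels never mix.
    """
    h, wd = len(x[0]), len(x[0][0])
    return [[[sum(k[ch][dr][dc] * x[ch][r + dr][c + dc]
                  for dr in range(3) for dc in range(3))
              for c in range(wd - 2)]
             for r in range(h - 2)]
            for ch in range(len(x))]


def extreme_inception(x, w, k):
    """Cross-channel step first, then the spatial step on each channel."""
    return depthwise_conv3x3(pointwise_conv(x, w), k)


# Toy input: 2 channels of size 3x3, filled with 1s and 2s.
x = [[[1] * 3 for _ in range(3)], [[2] * 3 for _ in range(3)]]
w = [[1, 1], [1, -1]]                                   # 2 -> 2 channels
k = [[[1] * 3 for _ in range(3)] for _ in range(2)]     # all-ones kernels
print(extreme_inception(x, w, k))
```

Because the spatial step never mixes channels, each output channel depends on exactly one channel produced by the 1x1 Conv.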

Xception

  • Xception module:
    [Figure 2]
  • First use 1x1 Conv
  • Then apply the depthwise (per-channel) part of a depthwise separable convolution (DSC): each feature map gets its own 3x3 Conv, and the results of these Convs are concatenated.
  • Advantages: Efficient parameter usage
  • Whole model
    [Figure 3]
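
As a rough illustration of the parameter efficiency, the weight count of a regular 3x3 Conv can be compared with a 1x1 Conv followed by one 3x3 kernel per channel. This is a sketch under stated assumptions: biases are ignored, the helper names are made up, and the 256-channel layer size is an arbitrary example rather than a figure from the paper.

```python
def conv_params(c_in, c_out, k=3):
    """Weights in a regular k x k Conv: every output channel sees
    every input channel through its own k x k kernel."""
    return k * k * c_in * c_out


def separable_params(c_in, c_out, k=3):
    """Weights in a 1x1 Conv (c_in -> c_out) followed by one k x k
    kernel per resulting channel."""
    return c_in * c_out + k * k * c_out


regular = conv_params(256, 256)          # 9 * 256 * 256
separable = separable_params(256, 256)   # 256 * 256 + 9 * 256
print(regular, separable)                # 589824 67840
```

For this layer size the factored version needs roughly one ninth of the weights, which leaves room for more channels or more layers within the same parameter budget.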

Experiment

  • Dataset: ImageNet and JFT (internal Google dataset); models trained on JFT are evaluated on FastEval14k.
  • Result
    • Xception converges faster than Inception V3 and gets higher accuracy.
    • Xception reaches 21.0% top-1 and 5.5% top-5 error on ImageNet (single crop, single model).
    • Better with residual connection.
    • Worse with a non-linearity between the 1x1 Conv and the per-channel spatial Conv of the DSC.
