Backpropagation in Convolutional Networks

References:
https://blog.csdn.net/Candy_GL/article/details/79470804
https://www.cnblogs.com/pinard/p/6494810.html
https://www.slideshare.net/mobile/kuwajima/cnnbp

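These articles are well written, but I still found them confusing. In the first two, the error of the pooling layer is assumed known and is used to compute the error of the layer before it. As I understand it, that computation then no longer has anything to do with downsampling or upsampling.

According to the referenced articles, however, $a^l$ is the result of downsampling $a^{l-1}$ (in the notation of the second article, $z^l$ is the input to the activation of layer $l$, $a^l$ is the layer's output, and $\delta^l = \partial J / \partial z^l$ is its error for a loss $J$). That means the error $\delta^l$ was computed without passing back through the downsampling, that is, without passing through the pooling layer. $\delta^l$ is therefore the error of the layer after the pooling, and the pooling itself actually sits in layer $l-1$ (which may be a convolution followed by pooling). Read this way, the description makes sense.
In that case,

$$\delta^{l-1} = \frac{\partial J}{\partial z^{l-1}} = \left(\frac{\partial z^{l}}{\partial z^{l-1}}\right)^{T} \delta^{l}$$

and what $\partial z^{l} / \partial z^{l-1}$ actually is depends on the type of layer $l-1$.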
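To make the upsampling step concrete, here is a minimal sketch of the backward pass of non-overlapping average pooling on a single feature map. The `Map2D` type and the functions `avg_pool` and `avg_pool_backward` are hypothetical names introduced for illustration; they do not come from the referenced articles or from Caffe. The sketch only spreads the error back over the pooling window; any activation derivative that is multiplied in afterwards is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major 2D feature map (hypothetical helper type for this sketch).
struct Map2D {
  std::size_t rows, cols;
  std::vector<double> data;
  Map2D(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}
  double& at(std::size_t i, std::size_t j) { return data[i * cols + j]; }
  double at(std::size_t i, std::size_t j) const { return data[i * cols + j]; }
};

// Forward: non-overlapping k x k average pooling (downsampling).
Map2D avg_pool(const Map2D& a_prev, std::size_t k) {
  assert(k > 0 && a_prev.rows % k == 0 && a_prev.cols % k == 0);
  const double area = static_cast<double>(k * k);
  Map2D out(a_prev.rows / k, a_prev.cols / k);
  for (std::size_t i = 0; i < a_prev.rows; ++i) {
    for (std::size_t j = 0; j < a_prev.cols; ++j) {
      out.at(i / k, j / k) += a_prev.at(i, j) / area;
    }
  }
  return out;
}

// Backward: upsample the error to the pre-pooling size. Every input cell
// contributed 1 / (k * k) to its window, so it receives that share of the error.
Map2D avg_pool_backward(const Map2D& delta, std::size_t k) {
  assert(k > 0);
  const double area = static_cast<double>(k * k);
  Map2D up(delta.rows * k, delta.cols * k);
  for (std::size_t i = 0; i < up.rows; ++i) {
    for (std::size_t j = 0; j < up.cols; ++j) {
      up.at(i, j) = delta.at(i / k, j / k) / area;
    }
  }
  return up;
}
```

For max pooling the same structure applies, except that the whole error of a window goes to the position that held the maximum in the forward pass, which therefore has to be remembered.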

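The derivation for a convolutional layer works the same way: the actual convolutional layer is also in layer $l-1$ (the error $\delta^{l-1} = \partial J / \partial z^{l-1}$ computed here is that of the convolutional layer).

If the layer were fully connected, we would have $z^l = W^l \sigma(z^{l-1}) + b^l$ (assuming the activation function $\sigma$ is the sigmoid). Here, however, the layer is convolutional: $z^l$ involves both a convolution and the activation function, that is, $z^l = \sigma(z^{l-1}) * W^l + b^l$, so we need both the derivative of the activation function and the derivative of the convolution. The final result is

$$\delta^{l-1} = \delta^{l} * \mathrm{rot180}(W^{l}) \odot \sigma'(z^{l-1})$$

The detailed derivation is in the second article above.
The formula shows that backpropagation through a convolutional layer consists only of a convolution and an element-wise product; no new type of operation is needed.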
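The claim that the backward pass is itself a convolution can be checked numerically. The following is a minimal sketch under simplifying assumptions: a single channel, stride 1, no padding, and "convolution" implemented as cross-correlation (as deep learning frameworks usually do). All names (`Mat`, `conv_forward`, `backward_scatter`, `backward_rot180`) are hypothetical and chosen for this illustration. `backward_scatter` applies the chain rule directly, while `backward_rot180` zero-pads the error and correlates it with the kernel rotated by 180 degrees; both should give the same gradient with respect to the layer input. The activation derivative is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<double>>;

Mat zeros(std::size_t r, std::size_t c) {
  return Mat(r, std::vector<double>(c, 0.0));
}

// Forward: "valid" cross-correlation with stride 1.
Mat conv_forward(const Mat& a, const Mat& w) {
  assert(!a.empty() && !w.empty());
  assert(a.size() >= w.size() && a[0].size() >= w[0].size());
  const std::size_t kh = w.size(), kw = w[0].size();
  Mat z = zeros(a.size() - kh + 1, a[0].size() - kw + 1);
  for (std::size_t i = 0; i < z.size(); ++i)
    for (std::size_t j = 0; j < z[0].size(); ++j)
      for (std::size_t p = 0; p < kh; ++p)
        for (std::size_t q = 0; q < kw; ++q)
          z[i][j] += a[i + p][j + q] * w[p][q];
  return z;
}

// Backward via the chain rule: each output error is scattered back to
// the input cells that produced it, weighted by the kernel entry used.
Mat backward_scatter(const Mat& delta, const Mat& w) {
  assert(!delta.empty() && !w.empty());
  const std::size_t kh = w.size(), kw = w[0].size();
  Mat da = zeros(delta.size() + kh - 1, delta[0].size() + kw - 1);
  for (std::size_t i = 0; i < delta.size(); ++i)
    for (std::size_t j = 0; j < delta[0].size(); ++j)
      for (std::size_t p = 0; p < kh; ++p)
        for (std::size_t q = 0; q < kw; ++q)
          da[i + p][j + q] += delta[i][j] * w[p][q];
  return da;
}

// Rotate the kernel by 180 degrees (flip both axes).
Mat rot180(const Mat& w) {
  assert(!w.empty());
  const std::size_t kh = w.size(), kw = w[0].size();
  Mat r = zeros(kh, kw);
  for (std::size_t p = 0; p < kh; ++p)
    for (std::size_t q = 0; q < kw; ++q)
      r[p][q] = w[kh - 1 - p][kw - 1 - q];
  return r;
}

// Backward as a "full" convolution: zero-pad the error by (kernel size - 1)
// on every side, then run the forward routine with the rotated kernel.
Mat backward_rot180(const Mat& delta, const Mat& w) {
  assert(!delta.empty() && !w.empty());
  const std::size_t kh = w.size(), kw = w[0].size();
  Mat padded = zeros(delta.size() + 2 * (kh - 1),
                     delta[0].size() + 2 * (kw - 1));
  for (std::size_t i = 0; i < delta.size(); ++i)
    for (std::size_t j = 0; j < delta[0].size(); ++j)
      padded[i + kh - 1][j + kw - 1] = delta[i][j];
  return conv_forward(padded, rot180(w));
}
```

With a 2 x 2 error map and a 2 x 2 kernel, both backward routines return the same 3 x 3 gradient, which is the size of the input the forward pass would have consumed.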
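Comparing the forward and backward code in Caffe's base_conv_layer.cpp confirms that backpropagation is indeed a transposed convolution: forward_cpu_gemm multiplies the weights by the im2col buffer, while backward_cpu_gemm multiplies the transposed weights (CblasTrans) by `output` and then applies col2im.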

template <typename Dtype>
void BaseConvolutionLayer<Dtype>::forward_cpu_gemm(const Dtype* input,
    const Dtype* weights, Dtype* output, bool skip_im2col) {
  const Dtype* col_buff = input;
  if (!is_1x1_) {
    if (!skip_im2col) {
      conv_im2col_cpu(input, col_buffer_.mutable_cpu_data());
    }
    col_buff = col_buffer_.cpu_data();
  }
  for (int g = 0; g < group_; ++g) {
    // output = weights * col_buff, computed per group
    caffe_cpu_gemm<Dtype>(CblasNoTrans, CblasNoTrans, conv_out_channels_ /
        group_, conv_out_spatial_dim_, kernel_dim_,
        (Dtype)1., weights + weight_offset_ * g, col_buff + col_offset_ * g,
        (Dtype)0., output + output_offset_ * g);
  }
}

template <typename Dtype>
void BaseConvolutionLayer<Dtype>::backward_cpu_gemm(const Dtype* output,
    const Dtype* weights, Dtype* input) {
  Dtype* col_buff = col_buffer_.mutable_cpu_data();
  if (is_1x1_) {
    col_buff = input;
  }
  for (int g = 0; g < group_; ++g) {
    // col_buff = weights^T * output, computed per group
    caffe_cpu_gemm<Dtype>(CblasTrans, CblasNoTrans, kernel_dim_,
        conv_out_spatial_dim_, conv_out_channels_ / group_,
        (Dtype)1., weights + weight_offset_ * g, output + output_offset_ * g,
        (Dtype)0., col_buff + col_offset_ * g);
  }
  if (!is_1x1_) {
    conv_col2im_cpu(col_buff, input);
  }
}

template<>
void caffe_cpu_gemm<float>(const CBLAS_TRANSPOSE TransA,
    const CBLAS_TRANSPOSE TransB, const int M, const int N, const int K,
    const float alpha, const float* A, const float* B, const float beta,
    float* C) {
  int lda = (TransA == CblasNoTrans) ? K : M;
  int ldb = (TransB == CblasNoTrans) ? N : K;
  cblas_sgemm(CblasRowMajor, TransA, TransB, M, N, K, alpha, A, lda, B,
      ldb, beta, C, N);
}
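The relationship between the two GEMM calls can be reproduced without BLAS. The sketch below is a simplified stand-in, not Caffe code: `naive_gemm`, `forward_gemm` and `backward_gemm` are hypothetical names, there is a single group, and im2col/col2im are assumed to have been applied already. It uses the same leading-dimension rule as caffe_cpu_gemm above (lda is K without transposition and M with it), so the weight matrix is stored once and merely read transposed in the backward pass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C (M x N) = op(A) * B with row-major storage, where op(A) is A or A^T.
void naive_gemm(bool trans_a, std::size_t M, std::size_t N, std::size_t K,
                const std::vector<double>& A, const std::vector<double>& B,
                std::vector<double>& C) {
  assert(A.size() == M * K && B.size() == K * N);
  // Same rule as in caffe_cpu_gemm: lda = K if A is not transposed, else M.
  const std::size_t lda = trans_a ? M : K;
  C.assign(M * N, 0.0);
  for (std::size_t m = 0; m < M; ++m)
    for (std::size_t n = 0; n < N; ++n)
      for (std::size_t k = 0; k < K; ++k) {
        const double a = trans_a ? A[k * lda + m] : A[m * lda + k];
        C[m * N + n] += a * B[k * N + n];
      }
}

// Forward: output (c_out x s) = W (c_out x kd) * col (kd x s).
std::vector<double> forward_gemm(const std::vector<double>& W,
                                 const std::vector<double>& col,
                                 std::size_t c_out, std::size_t kd,
                                 std::size_t s) {
  std::vector<double> output;
  naive_gemm(false, c_out, s, kd, W, col, output);
  return output;
}

// Backward: col_grad (kd x s) = W^T (kd x c_out) * out_grad (c_out x s).
// W keeps its forward layout (c_out x kd); only the read pattern changes.
std::vector<double> backward_gemm(const std::vector<double>& W,
                                  const std::vector<double>& out_grad,
                                  std::size_t c_out, std::size_t kd,
                                  std::size_t s) {
  std::vector<double> col_grad;
  naive_gemm(true, kd, s, c_out, W, out_grad, col_grad);
  return col_grad;
}
```

Because the backward product uses the transpose of the forward matrix, the inner product of the forward output with any output gradient equals the inner product of the column buffer with the resulting column gradient. That identity is a convenient sanity check for an implementation of this kind.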
