Forward and Backward Propagation in a Convolutional Neural Network (CNN), Part 1


  • The difference between convolution and correlation
  • Forward and backward propagation of the convolution operation

The original article, Forward And Backpropagation in Convolutional Neural Network, is at https://medium.com/@2017csm1006/forward-and-backpropagation-in-convolutional-neural-network-4dfa96d7b37e. It is blocked behind the firewall, so it is written up here for everyone's convenience.

The article below walks through how backpropagation works for the convolution operation in a CNN.

The Difference Between Convolution and Correlation

Consider the input and the kernel (also called the filter) of a convolutional layer. For simplicity, let $channel = 1$, let the input matrix $X$ be 3x3 and the kernel $F$ be 2x2, with $padding = 0$ and $stride = 1$. The forward pass is the convolution operation.

[Figure: the convolution of the input $X$ with the kernel $F$]
The cross-correlation $O$ of the filter matrix $F$ with the input matrix $X$ is shown below:
[Figure: the cross-correlation of $F$ with $X$]
Convolving the kernel with the input image is equivalent to rotating the kernel by 180 degrees (flipping it horizontally and then vertically) and then cross-correlating it with the input matrix:
[Figure: convolution implemented as correlation with the 180°-rotated kernel]
So convolution and correlation are essentially the same operation: convolution is just correlation performed with the rotated kernel.
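
This equivalence is easy to check numerically. Below is a minimal SciPy sketch (the values of X and F are arbitrary examples, not taken from the figures): a 'valid' convolution of X with F matches a 'valid' correlation of X with the 180°-rotated F.

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

X = np.arange(9, dtype=float).reshape(3, 3)  # 3x3 input (example values)
F = np.array([[1., 2.],
              [3., 4.]])                     # 2x2 kernel (example values)

conv = convolve2d(X, F, mode='valid')                # true convolution: F is flipped internally
corr = correlate2d(X, np.rot90(F, 2), mode='valid')  # correlation with F rotated 180 degrees
assert np.allclose(conv, corr)                       # the two results are identical
```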

Forward and Backward Propagation of the Convolution Operation

Note: to make it easy to derive the gradient equations for the kernel values and the input values, we will treat the convolution operation and the correlation operation as the same thing; this is purely for convenience.
The convolution operation can therefore be represented by the figure below:
[Figure: the forward pass, $X$ convolved (correlated) with $F$ to produce $O$]

Note that for convenience $F$ is not rotated by 180 degrees here, so convolution and correlation coincide.
This can be visualized with the figure below.
[Figure: visualization of how each output value is computed from a window of $X$ weighted by $F$]
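
As a concrete reference, here is a minimal NumPy sketch of this simplified forward pass; the names X, F, and O follow the article's notation, and the values are arbitrary examples.

```python
import numpy as np

def forward(X, F):
    """Valid cross-correlation, stride 1, no padding: the simplified
    'convolution' used in this derivation (F is not flipped)."""
    out_h = X.shape[0] - F.shape[0] + 1
    out_w = X.shape[1] - F.shape[1] + 1
    O = np.zeros((out_h, out_w))
    for m in range(out_h):
        for n in range(out_w):
            O[m, n] = np.sum(X[m:m + F.shape[0], n:n + F.shape[1]] * F)
    return O

X = np.arange(1., 10.).reshape(3, 3)  # 3x3 input (example values)
F = np.array([[1., 0.],
              [0., -1.]])             # 2x2 kernel (example values)
O = forward(X, F)                     # 2x2 output
```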

Now, to compute the gradient of the error $E$ with respect to the kernel $F$, we need to solve the following equations.

$$\frac{\partial E}{\partial F_{11}}=\frac{\partial E}{\partial O_{11}}\cdot\frac{\partial O_{11}}{\partial F_{11}}+\frac{\partial E}{\partial O_{12}}\cdot\frac{\partial O_{12}}{\partial F_{11}}+\frac{\partial E}{\partial O_{21}}\cdot\frac{\partial O_{21}}{\partial F_{11}}+\frac{\partial E}{\partial O_{22}}\cdot\frac{\partial O_{22}}{\partial F_{11}}$$

$$\frac{\partial E}{\partial F_{12}}=\frac{\partial E}{\partial O_{11}}\cdot\frac{\partial O_{11}}{\partial F_{12}}+\frac{\partial E}{\partial O_{12}}\cdot\frac{\partial O_{12}}{\partial F_{12}}+\frac{\partial E}{\partial O_{21}}\cdot\frac{\partial O_{21}}{\partial F_{12}}+\frac{\partial E}{\partial O_{22}}\cdot\frac{\partial O_{22}}{\partial F_{12}}$$

$$\frac{\partial E}{\partial F_{21}}=\frac{\partial E}{\partial O_{11}}\cdot\frac{\partial O_{11}}{\partial F_{21}}+\frac{\partial E}{\partial O_{12}}\cdot\frac{\partial O_{12}}{\partial F_{21}}+\frac{\partial E}{\partial O_{21}}\cdot\frac{\partial O_{21}}{\partial F_{21}}+\frac{\partial E}{\partial O_{22}}\cdot\frac{\partial O_{22}}{\partial F_{21}}$$

$$\frac{\partial E}{\partial F_{22}}=\frac{\partial E}{\partial O_{11}}\cdot\frac{\partial O_{11}}{\partial F_{22}}+\frac{\partial E}{\partial O_{12}}\cdot\frac{\partial O_{12}}{\partial F_{22}}+\frac{\partial E}{\partial O_{21}}\cdot\frac{\partial O_{21}}{\partial F_{22}}+\frac{\partial E}{\partial O_{22}}\cdot\frac{\partial O_{22}}{\partial F_{22}}$$

Since each output value is $O_{mn}=\sum_{i,j} X_{m+i-1,\,n+j-1}\,F_{ij}$, we have $\partial O_{mn}/\partial F_{ij}=X_{m+i-1,\,n+j-1}$, and the equations above reduce to:

$$\frac{\partial E}{\partial F_{11}}=\frac{\partial E}{\partial O_{11}}\cdot X_{11}+\frac{\partial E}{\partial O_{12}}\cdot X_{12}+\frac{\partial E}{\partial O_{21}}\cdot X_{21}+\frac{\partial E}{\partial O_{22}}\cdot X_{22}$$

$$\frac{\partial E}{\partial F_{12}}=\frac{\partial E}{\partial O_{11}}\cdot X_{12}+\frac{\partial E}{\partial O_{12}}\cdot X_{13}+\frac{\partial E}{\partial O_{21}}\cdot X_{22}+\frac{\partial E}{\partial O_{22}}\cdot X_{23}$$

$$\frac{\partial E}{\partial F_{21}}=\frac{\partial E}{\partial O_{11}}\cdot X_{21}+\frac{\partial E}{\partial O_{12}}\cdot X_{22}+\frac{\partial E}{\partial O_{21}}\cdot X_{31}+\frac{\partial E}{\partial O_{22}}\cdot X_{32}$$

$$\frac{\partial E}{\partial F_{22}}=\frac{\partial E}{\partial O_{11}}\cdot X_{22}+\frac{\partial E}{\partial O_{12}}\cdot X_{23}+\frac{\partial E}{\partial O_{21}}\cdot X_{32}+\frac{\partial E}{\partial O_{22}}\cdot X_{33}$$
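
These four equations are instances of the general rule $\partial E/\partial F_{ij}=\sum_{m,n}(\partial E/\partial O_{mn})\cdot X_{m+i-1,\,n+j-1}$. A direct loop transcription might look like the following sketch (dE_dO stands for the upstream gradient $\partial E/\partial O$; its values are arbitrary examples):

```python
import numpy as np

def filter_grad(X, dE_dO, kh, kw):
    """dE/dF[i, j] = sum over (m, n) of dE/dO[m, n] * X[m + i, n + j]
    (0-based indices); a direct transcription of the four equations above."""
    dF = np.zeros((kh, kw))
    for i in range(kh):
        for j in range(kw):
            for m in range(dE_dO.shape[0]):
                for n in range(dE_dO.shape[1]):
                    dF[i, j] += dE_dO[m, n] * X[m + i, n + j]
    return dF

X = np.arange(1., 10.).reshape(3, 3)         # 3x3 input from the forward pass
dE_dO = np.array([[0.1, -0.2],
                  [0.3, 0.4]])               # example upstream gradient dE/dO
dF = filter_grad(X, dE_dO, 2, 2)             # 2x2 gradient, same shape as F
```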

Looking closely, these equations can be written as a single convolution operation:
[Figure: $\partial E/\partial F$ expressed as the convolution of $X$ with $\partial E/\partial O$]
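
That is, $\partial E/\partial F$ is itself a 'valid' convolution (correlation, in this article's simplified sense) of the input $X$ with the output gradient $\partial E/\partial O$. A one-line SciPy sketch, using the same example arrays as above, gives the same result as the explicit loop:

```python
import numpy as np
from scipy.signal import correlate2d

X = np.arange(1., 10.).reshape(3, 3)         # 3x3 input
dE_dO = np.array([[0.1, -0.2],
                  [0.3, 0.4]])               # example upstream gradient dE/dO

# dE/dF is the 'valid' cross-correlation of X with dE/dO;
# this matches the element-wise equations (and the loop) above.
dF = correlate2d(X, dE_dO, mode='valid')     # shape (2, 2), same as F
```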
Similarly, we can obtain the gradient of the error $E$ with respect to the input matrix $X$:
[Figure: the chain-rule equations for $\partial E/\partial X$]

To obtain the input gradient $\partial E/\partial X$, we rotate the kernel $F$ by 180 degrees and compute its full convolution with the output gradient $\partial E/\partial O$, as shown in the figure below.
[Figure: $\partial E/\partial X$ as the full convolution of the 180°-rotated kernel with $\partial E/\partial O$]

The full convolution can be visualized as carrying out the steps shown below.
[Figure: step-by-step visualization of the full convolution]
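
This can also be checked numerically. In the sketch below (example values again), the full correlation of $\partial E/\partial O$ with the rotated kernel equals SciPy's full convolution with the unrotated kernel, since convolution flips the kernel internally:

```python
import numpy as np
from scipy.signal import convolve2d, correlate2d

F = np.array([[1., 0.],
              [0., -1.]])                    # 2x2 kernel
dE_dO = np.array([[0.1, -0.2],
                  [0.3, 0.4]])               # example upstream gradient dE/dO

# Full correlation of dE/dO with the 180-degree-rotated kernel ...
dX = correlate2d(dE_dO, np.rot90(F, 2), mode='full')  # shape (3, 3), same as X
# ... which equals a full convolution with F itself, because
# convolution already flips the kernel internally.
assert np.allclose(dX, convolve2d(dE_dO, F, mode='full'))
```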

The convolution operation can therefore implement both the forward pass and the backward pass of a convolutional layer.

The gradients of the pooling and ReLU layers can be computed by applying the chain rule in the same way.
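
For example, a ReLU layer's backward pass is just the upstream gradient masked by where the forward input was positive. A minimal sketch (all names and values here are illustrative):

```python
import numpy as np

def relu_backward(dE_dA, Z):
    """Chain rule: dE/dZ = dE/dA * dA/dZ, where dA/dZ is 1 where Z > 0 and 0 elsewhere."""
    return dE_dA * (Z > 0)

Z = np.array([[1.0, -2.0],
              [-0.5, 3.0]])                  # pre-activation input (example values)
dE_dA = np.array([[0.1, 0.2],
                  [0.3, 0.4]])               # upstream gradient (example values)
dE_dZ = relu_backward(dE_dA, Z)              # gradient flows only where Z > 0
```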
