This article is reposted from a foreign-language website:
https://medium.com/@2017csm1006/forward-and-backpropagation-in-convolutional-neural-network-4dfa96d7b37e
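The article uses animated figures to illustrate how backpropagation is computed in a convolutional neural network, which makes it simpler and clearer to follow.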
This post demonstrates how the convolution operation can be used to carry out backpropagation in a CNN.
Consider the input and the filter used for the convolution, as given above.
The correlation of the filter matrix with the input matrix is then described in the figure below.
Now, the convolution of the filter matrix with the input image is the same as rotating the filter by 180 degrees and then carrying out the correlation of the rotated filter matrix with the input matrix.
As can be seen from the above image, the convolution operation is the same as the correlation operation, but with a rotated filter.
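The relationship between the two operations can be sketched in a few lines of plain Python. This is a minimal illustration, assuming stride 1 and no padding; the helper names (`rot180`, `correlate2d_valid`, `convolve2d_valid`) and the sample values are made up for this example and are not taken from the post's figures.

```python
def rot180(k):
    """Rotate a 2-D list by 180 degrees (reverse the rows, then each row)."""
    return [row[::-1] for row in k[::-1]]


def correlate2d_valid(x, k):
    """Cross-correlation: slide k over x without flipping it (stride 1, no padding)."""
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + m][j + n] * k[m][n]
                 for m in range(kh) for n in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def convolve2d_valid(x, k):
    """Convolution: correlate with the filter rotated by 180 degrees."""
    return correlate2d_valid(x, rot180(k))


# Illustrative values only (assumed, not the numbers in the post's figures).
X = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
F = [[1, 0],
     [2, -1]]

corr = correlate2d_valid(X, F)   # [[4, 6], [10, 12]]
conv = convolve2d_valid(X, F)    # [[8, 10], [14, 16]]
print(corr, conv)
```

The two results differ because this filter is not symmetric under a 180 degree rotation; for a symmetric filter they would coincide.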
Note: To derive the equations for the gradients of the filter values and the input matrix values, we will treat the convolution operation as the same as the correlation operation, just for simplicity.
Therefore, the convolution operation can be written as described in the figure below.
Notice that the filter is not rotated here, for the sake of simplicity, so convolution here is the same as correlation.
It can be visualized in the figure below.
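In index form, the unrotated operation can be written as follows. This is a sketch under assumptions that are not stated in the text: stride 1, no padding, indices starting at 0, and a filter of size k_h by k_w (these size symbols are introduced here only for illustration).

```latex
O_{ij} \;=\; \sum_{m=0}^{k_h-1} \sum_{n=0}^{k_w-1} X_{i+m,\; j+n} \, F_{mn}
```

Each output value O_{ij} is the sum of the elementwise products between the filter F and the patch of the input X whose top-left corner is at position (i, j).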
Here, E is the error obtained.
Now, to calculate the gradients of the error ‘E’ with respect to the filter ‘F’, the following equations need to be solved.
which evaluates to
If we look closely, the above equation can be written in the form of our convolution operation.
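A short numerical sketch of this observation follows: under the unrotated (correlation) convention with stride 1 and no padding, the gradient of E with respect to each filter value is the correlation of the input X with the output gradients. The function names, the sample values, and the toy error (a weighted sum of the outputs, chosen so that the output gradients are known exactly) are all assumptions made for illustration.

```python
def correlate2d_valid(x, k):
    """Cross-correlation with stride 1 and no padding (filter not rotated)."""
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + m][j + n] * k[m][n]
                 for m in range(kh) for n in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def filter_gradient(x, d_out):
    """dE/dF: correlate the input with the gradients of E w.r.t. the output."""
    return correlate2d_valid(x, d_out)


def toy_error(x, f, d_out):
    """Toy error E = sum(d_out * O), so that dE/dO equals d_out exactly."""
    o = correlate2d_valid(x, f)
    return sum(d_out[i][j] * o[i][j]
               for i in range(len(o)) for j in range(len(o[0])))


def numeric_filter_gradient(x, f, d_out):
    """Bump each filter value by 1; E is linear in F, so the change is the gradient."""
    base = toy_error(x, f, d_out)
    grad = []
    for m in range(len(f)):
        row = []
        for n in range(len(f[0])):
            bumped = [r[:] for r in f]
            bumped[m][n] += 1
            row.append(toy_error(x, bumped, d_out) - base)
        grad.append(row)
    return grad


# Illustrative values only (assumed, not the numbers in the post's figures).
X = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
F = [[1, 0],
     [2, -1]]
dO = [[1, -1],
      [2, 0]]   # assumed gradients of E with respect to the output O

dF = filter_gradient(X, dO)                    # [[7, 9], [13, 15]]
dF_check = numeric_filter_gradient(X, F, dO)   # same values, found by perturbation
print(dF, dF_check)
```

The perturbation check is exact here only because the toy error is linear in the filter values; with a nonlinear error one would compare against small finite differences instead.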
Similarly, we can find the gradients of the error ‘E’ with respect to the input matrix ‘X’.
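Written out with the chain rule, this gradient takes the following form. The sketch assumes the unrotated forward operation O_{ij} = \sum_{m,n} X_{i+m, j+n} F_{mn} with stride 1, no padding, and indices starting at 0; any term whose output index falls outside the output matrix is taken to be zero.

```latex
\frac{\partial E}{\partial X_{pq}}
  \;=\; \sum_{i}\sum_{j} \frac{\partial E}{\partial O_{ij}} \,
        \frac{\partial O_{ij}}{\partial X_{pq}}
  \;=\; \sum_{m}\sum_{n} \frac{\partial E}{\partial O_{p-m,\; q-n}} \, F_{mn}
```

The second equality holds because X_{pq} appears in O_{ij} only with the coefficient F_{p-i, q-j}. Note that the filter index runs in the opposite direction to the output index, which is where the 180 degree rotation comes from.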
Now, the above computation can be obtained by a different type of convolution operation known as full convolution. To obtain the gradients of the input matrix, we need to rotate the filter by 180 degrees and calculate the full convolution of the rotated filter with the gradients of the error with respect to the output, as represented in the image below.
The full convolution can be visualized as the procedure shown in the figure below.
Here, ‘δX’ represents the gradients of the error with respect to X.
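The full convolution step can be sketched as zero-padding the output gradients by the filter size minus one on every side and then sliding the rotated filter over them. This is a minimal illustration under the post's simplification (the sliding operation itself does not flip the filter), with stride 1; the helper names, sample values, and toy error are assumptions made for this example.

```python
def rot180(k):
    """Rotate a 2-D list by 180 degrees."""
    return [row[::-1] for row in k[::-1]]


def correlate2d_valid(x, k):
    """Cross-correlation with stride 1 and no padding (filter not rotated)."""
    kh, kw = len(k), len(k[0])
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    return [[sum(x[i + m][j + n] * k[m][n]
                 for m in range(kh) for n in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def zero_pad(a, pad_h, pad_w):
    """Surround a 2-D list with pad_h rows and pad_w columns of zeros."""
    width = len(a[0]) + 2 * pad_w
    top = [[0] * width for _ in range(pad_h)]
    bottom = [[0] * width for _ in range(pad_h)]
    middle = [[0] * pad_w + list(row) + [0] * pad_w for row in a]
    return top + middle + bottom


def input_gradient(f, d_out):
    """dE/dX: slide the rotated filter over the zero-padded output gradients."""
    padded = zero_pad(d_out, len(f) - 1, len(f[0]) - 1)
    return correlate2d_valid(padded, rot180(f))


def numeric_input_gradient(x, f, d_out):
    """Bump each input value by 1; the toy error is linear in X, so this is exact."""
    def toy_error(x_):
        o = correlate2d_valid(x_, f)
        return sum(d_out[i][j] * o[i][j]
                   for i in range(len(o)) for j in range(len(o[0])))
    base = toy_error(x)
    grad = []
    for p in range(len(x)):
        row = []
        for q in range(len(x[0])):
            bumped = [r[:] for r in x]
            bumped[p][q] += 1
            row.append(toy_error(bumped) - base)
        grad.append(row)
    return grad


# Illustrative values only (assumed, not the numbers in the post's figures).
X = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
F = [[1, 0],
     [2, -1]]
dO = [[1, -1],
      [2, 0]]   # assumed gradients of E with respect to the output O

dX = input_gradient(F, dO)                    # [[1, -1, 0], [4, -3, 1], [4, -2, 0]]
dX_check = numeric_input_gradient(X, F, dO)   # same values, found by perturbation
print(dX, dX_check)
```

The result has the same shape as the input, which is what the extra padding is for: every input value, including those at the corners, receives a gradient.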
Hence both the forward and backward propagation can be performed using the convolution operation.
The gradients of the pooling and ReLU layers can be calculated by following the same procedure, using the chain rule of derivatives.
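As a rough sketch of what the chain rule gives for these two layers: ReLU passes the incoming gradient through wherever its input was positive, and max pooling routes each incoming gradient to the position that held the maximum in its window. The example assumes 2x2 max pooling with stride 2 on an input with even dimensions; the post does not specify the pooling type or size, and the function names and values are made up for illustration.

```python
def relu_forward(x):
    """ReLU: replace negative values with zero."""
    return [[max(0, v) for v in row] for row in x]


def relu_backward(x, d_out):
    """Pass the gradient through where the input was positive, zero elsewhere."""
    return [[g if v > 0 else 0 for v, g in zip(x_row, g_row)]
            for x_row, g_row in zip(x, d_out)]


def maxpool2x2_forward(x):
    """2x2 max pooling with stride 2 (assumes even height and width)."""
    return [[max(x[i][j], x[i][j + 1], x[i + 1][j], x[i + 1][j + 1])
             for j in range(0, len(x[0]), 2)]
            for i in range(0, len(x), 2)]


def maxpool2x2_backward(x, d_out):
    """Send each output gradient to the position of the maximum in its window."""
    d_x = [[0] * len(x[0]) for _ in x]
    for i in range(0, len(x), 2):
        for j in range(0, len(x[0]), 2):
            window = [(x[i + a][j + b], i + a, j + b)
                      for a in range(2) for b in range(2)]
            # On ties, max() keeps the first maximal entry in the window.
            _, r, c = max(window, key=lambda t: t[0])
            d_x[r][c] += d_out[i // 2][j // 2]
    return d_x


# Illustrative values only (assumed).
X = [[1, -2, 3, 0],
     [-4, 5, -6, 7],
     [8, -9, 2, -1],
     [0, 3, -5, 4]]

ones = [[1] * 4 for _ in range(4)]
relu_grad = relu_backward(X, ones)
pooled = maxpool2x2_forward(X)                            # [[5, 7], [8, 4]]
pool_grad = maxpool2x2_backward(X, [[10, 20], [30, 40]])
print(relu_grad, pooled, pool_grad)
```

Neither layer has trainable parameters, so only the gradient with respect to the layer input needs to be computed and passed further back.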