【DeepLearning-Note】Implementation of Convolutional Neural Network

Basic Knowledge

1 INPUT[32*32*3]*filter[3*3*3]=feature map[(32-3+1)*(32-3+1)]=feature map[30*30]
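The convolution operation is an element-wise multiply followed by a sum: INPUT[width*height*depth/channel]
                  filter[width_*height_*depth/channel]
                  the depth of the filter matches the depth of INPUT
                  the number of feature maps produced equals the number of filters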
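The multiply-and-sum in item 1 can be sketched with a plain loop. This is only an illustrative sketch, assuming stride 1 and no padding; the function name `conv_single_filter` and the random test data are made up for this note.

```python
import numpy as np

def conv_single_filter(x, w):
    """Slide one filter w[F, F, D] over x[H, W, D] with stride 1 and no padding."""
    H, W, D = x.shape
    F = w.shape[0]
    assert w.shape == (F, F, D), "filter depth must match input depth"
    out = np.zeros((H - F + 1, W - F + 1))
    for i in range(H - F + 1):
        for j in range(W - F + 1):
            patch = x[i:i + F, j:j + F, :]   # local F*F*D region of the input
            out[i, j] = np.sum(patch * w)    # element-wise multiply, then sum
    return out

rng = np.random.default_rng(0)               # assumed random data, for illustration only
x = rng.standard_normal((32, 32, 3))
w = rng.standard_normal((3, 3, 3))
fmap = conv_single_filter(x, w)
print(fmap.shape)  # (30, 30)
```

One filter yields one 30*30 feature map; K filters would yield K such maps stacked along the depth axis.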
2 Local Connectivity
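hyperparameter:receptive field(the filter size):the spatial extent of the local region of the input volume that each neuron is connected to. In a convolution, the connections are local in space (along width and height) but always extend through the full depth of the input volume.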
3 Spatial arrangement(Output volume)
three hyperparameters:
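(1)depth:corresponds to the number of filters(each learning to look for something different in the input)
(2)stride:the step, in pixels, by which the filter moves across the input
(3)zero-padding:pads the border of the input volume with zeros; it is commonly used to preserve the spatial size of the input volume so that the input and output width and height are the same.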
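A small sketch of how zero-padding keeps the spatial size, assuming an odd filter size F and stride 1, in which case P=(F-1)/2 does the job. The helper name `zero_pad` is hypothetical; `np.pad` is the NumPy call doing the work.

```python
import numpy as np

def zero_pad(x, P):
    """Pad width and height of x[H, W, D] with P zeros on each side; depth is untouched."""
    return np.pad(x, ((P, P), (P, P), (0, 0)), mode="constant", constant_values=0)

x = np.ones((32, 32, 3))
F, S = 3, 1
P = (F - 1) // 2                        # assumed choice: "same" padding for odd F, stride 1
x_pad = zero_pad(x, P)
out_w = (x_pad.shape[1] - F) // S + 1   # filter positions along the padded width
print(x_pad.shape, out_w)  # (34, 34, 3) 32
```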

Realization of Convolutional Neural Network

  • Input W1*H1*D1
  • hyperparameters:the number of filters K,spatial extent F,stride S,zero-padding P
  • Output W2*H2*D2:
      W2=(W1-F+2P)/S+1
      H2=(H1-F+2P)/S+1

      D2=K
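The formulas above translate directly into a small helper. This is a sketch; the name `conv_output_shape` and the choice to raise an error when the filter does not fit evenly are assumptions made here, not part of the original note.

```python
def conv_output_shape(W1, H1, D1, K, F, S, P):
    """Return (W2, H2, D2) of a conv layer; D1 only fixes the filter depth, not the output."""
    if (W1 - F + 2 * P) % S or (H1 - F + 2 * P) % S:
        # Assumed policy: reject settings where the filter does not tile the input evenly.
        raise ValueError("(W1-F+2P) and (H1-F+2P) must be divisible by S")
    W2 = (W1 - F + 2 * P) // S + 1
    H2 = (H1 - F + 2 * P) // S + 1
    return W2, H2, K

print(conv_output_shape(32, 32, 3, K=1, F=3, S=1, P=0))      # (30, 30, 1)
print(conv_output_shape(227, 227, 3, K=96, F=11, S=4, P=0))  # (55, 55, 96)
```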


Implementation as Matrix Multiplication(from CS231n)

  • Forward: stretch the local regions of the input image into columns in an operation called im2col.

    1、X_col

    input image[227*227*3]

   filter[11*11*3],stride=4  =>stretch each block into a column vector of size 11*11*3=363

   output (227-11)/4+1=55 =>55*55=3025

   X_col,the result of im2col,is a matrix of size [363*3025](363:the elements of each receptive field,3025:the number of output locations)

   2、W_row

   The weights of the CONV layer are stretched out into rows.

   the number of the filters is 96.

   W_row is a matrix of size [96*363] 

   3、The convolution is then a single matrix multiplication (one dot product per filter and location): np.dot(W_row,X_col)=output [96*3025],which is reshaped back to [55*55*96]

   downside:uses a lot of memory,since some values in the input volume are replicated multiple times in X_col.

   benefit:there are many efficient implementations of Matrix Multiplication.
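The three steps above can be sketched in NumPy as follows. This is a minimal, loop-based illustration, not the CS231n reference code: the function names `im2col` and `conv_im2col`, the [K, F, F, D] filter layout, and the random data are all assumptions made for this note. No padding is applied.

```python
import numpy as np

def im2col(x, F, S):
    """Stretch every F*F*D receptive field of x[H, W, D] into one column."""
    H, W, D = x.shape
    out_h = (H - F) // S + 1
    out_w = (W - F) // S + 1
    cols = np.empty((F * F * D, out_h * out_w), dtype=x.dtype)
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i * S:i * S + F, j * S:j * S + F, :]
            cols[:, i * out_w + j] = patch.ravel()   # one receptive field per column
    return cols, out_h, out_w

def conv_im2col(x, filters, S):
    """filters is assumed to have shape [K, F, F, D]; output has shape [K, out_h, out_w]."""
    K, F = filters.shape[0], filters.shape[1]
    X_col, out_h, out_w = im2col(x, F, S)
    W_row = filters.reshape(K, -1)    # [K, F*F*D], same flattening order as patch.ravel()
    out = np.dot(W_row, X_col)        # [K, out_h*out_w]
    return out.reshape(K, out_h, out_w), X_col, W_row

rng = np.random.default_rng(0)        # assumed random data, for illustration only
x = rng.standard_normal((227, 227, 3))
filters = rng.standard_normal((96, 11, 11, 3))
out, X_col, W_row = conv_im2col(x, filters, S=4)
print(X_col.shape, W_row.shape, out.shape)  # (363, 3025) (96, 363) (96, 55, 55)
print(X_col.size / x.size)                  # about 7.1: X_col holds ~7x more values than the input
```

The last line makes the memory downside concrete: with F=11 and S=4 the receptive fields overlap, so X_col stores roughly seven times as many values as the original image.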

