A Simple Neural Network Based on `affine-BN-ReLu`, Explained

  • First, let's break down what Batch Normalization is
  • Batch Normalization:
    • Formulas (reproduced below the figures):
[Figure 1: BNforward.png — batch normalization forward pass]
[Figure 2: BNbackward.png — batch normalization backward pass]
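
Since the formulas themselves live only in the two figures above, here is the standard batch-normalization forward pass written out in the notation used by the code below (`sample_mean`, `sample_var`, `eps`, `gamma`, `beta`); the backward formulas in the second figure are exactly what the backward code implements:

    % standard BN forward equations, for a mini-batch of N rows
    \mu_B = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
    \sigma_B^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu_B)^2, \qquad
    \hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad
    y_i = \gamma\,\hat{x}_i + \beta
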
 - Forward pass in detail (a numpy sketch follows the backward code below):
     - `input = {X, gamma, beta, eps}`
     - Compute the per-feature mean and variance of X: `sample_mean` and `sample_var`
     - Normalize X with them, then scale and shift with `gamma` and `beta`
     - Use `sample_mean` and `sample_var` to update the running statistics used at test time:
        `running_mean = momentum * running_mean + (1 - momentum) * sample_mean`
        `running_var = momentum * running_var + (1 - momentum) * sample_var`
- Backward pass in detail:
     - `input = {dout, cache}`
     - Work backwards through the formulas above (the expressions are straightforward):
                  
            # unpack the values saved by the forward pass
            x_normalized, gamma, beta, sample_mean, sample_var, x, eps = cache
            N, D = x.shape

            # gradient w.r.t. the normalized input
            dx_normalized = dout * gamma
            x_mu = x - sample_mean
            sample_std_inv = 1.0 / np.sqrt(sample_var + eps)

            # gradients w.r.t. the batch variance and mean
            dsample_var = -0.5 * np.sum(dx_normalized * x_mu, axis=0, keepdims=True) * sample_std_inv**3
            dsample_mean = -1.0 * np.sum(dx_normalized * sample_std_inv, axis=0, keepdims=True) - \
                           2.0 * dsample_var * np.mean(x_mu, axis=0, keepdims=True)

            # combine the three paths through which x affects the output
            dx1 = dx_normalized * sample_std_inv
            dx2 = 2.0 / N * dsample_var * x_mu
            dx = dx1 + dx2 + 1.0 / N * dsample_mean

            # gradients w.r.t. the scale and shift parameters
            dgamma = np.sum(dout * x_normalized, axis=0, keepdims=True)
            dbeta = np.sum(dout, axis=0, keepdims=True)
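
To pair with the backward code above, here is a minimal numpy sketch of the forward pass described earlier. The function name `batchnorm_forward` and the `bn_param` dictionary (holding `mode`, `eps`, `momentum`, and the running statistics) are illustrative assumptions, not code from the original text; the cache it returns matches what the backward code unpacks.

    import numpy as np

    def batchnorm_forward(X, gamma, beta, bn_param):
        # bn_param carries the mode ('train'/'test'), eps, momentum and running statistics
        mode = bn_param.get('mode', 'train')
        eps = bn_param.get('eps', 1e-5)
        momentum = bn_param.get('momentum', 0.9)
        N, D = X.shape
        running_mean = bn_param.get('running_mean', np.zeros(D))
        running_var = bn_param.get('running_var', np.zeros(D))

        if mode == 'train':
            # per-feature mean and variance of the mini-batch
            sample_mean = np.mean(X, axis=0)
            sample_var = np.var(X, axis=0)

            # normalize, then scale and shift
            x_normalized = (X - sample_mean) / np.sqrt(sample_var + eps)
            out = gamma * x_normalized + beta

            # update the statistics used at test time
            running_mean = momentum * running_mean + (1 - momentum) * sample_mean
            running_var = momentum * running_var + (1 - momentum) * sample_var

            cache = (x_normalized, gamma, beta, sample_mean, sample_var, X, eps)
        else:
            # at test time, normalize with the accumulated running statistics
            x_normalized = (X - running_mean) / np.sqrt(running_var + eps)
            out = gamma * x_normalized + beta
            cache = None

        bn_param['running_mean'] = running_mean
        bn_param['running_var'] = running_var
        return out, cache
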
  • A simple neural network based on `affine-BN-ReLu`

[Figure 3: MultilayerNeuralNetwork.png — the affine-BN-ReLu network architecture]
  • Training uses SGD; each input is an (N, D) matrix X
  • X passes through two hidden layers, each of which is an affine (fully connected) layer followed by a Batch Normalization layer and a ReLU layer
  • The output loss can be a softmax loss or an SVM hinge loss
  • Backpropagation simply chains the affine, BN, and ReLU backward passes explained above (a sketch of the full forward pass follows this list)
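
A minimal sketch of that forward pass, assuming training-mode batch-norm statistics and a parameter dictionary holding `W1, b1, gamma1, beta1, W2, b2, gamma2, beta2, W3, b3`; the function name, parameter names, and loop structure are illustrative assumptions, not code from the original text:

    import numpy as np

    def affine_bn_relu_net_forward(X, params, eps=1e-5):
        """Forward pass of the two-hidden-layer affine-BN-ReLU network (training mode)."""
        h = X
        for i in (1, 2):
            W, b = params['W%d' % i], params['b%d' % i]
            gamma, beta = params['gamma%d' % i], params['beta%d' % i]

            h = h.dot(W) + b                                      # affine (fully connected) layer
            mean, var = h.mean(axis=0), h.var(axis=0)
            h = gamma * (h - mean) / np.sqrt(var + eps) + beta    # batch normalization
            h = np.maximum(0, h)                                  # ReLU

        # the final affine layer produces class scores;
        # a softmax or SVM hinge loss is applied on top of them
        scores = h.dot(params['W3']) + params['b3']
        return scores

For X of shape (N, D), W1 would have shape (D, H1), gamma1 and beta1 shape (H1,), W2 shape (H1, H2), and W3 shape (H2, C) for C classes; backpropagation reverses the affine, BN, and ReLU steps of each layer using the BN backward code shown earlier.
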
