A Detailed Explanation of torch's BatchNorm2D

  • A Zhihu article on understanding the various Normalization algorithms
  • A Jianshu post explaining the BatchNorm2d parameters
  • note 11: the official BatchNorm2D manual, explained in detail below (a usage sketch follows the figure placeholder):
    [Figures 1 and 2: screenshots of the official BatchNorm2D documentation]
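Since the documentation screenshots are not reproduced here, the following is a minimal sketch of what nn.BatchNorm2d computes in training mode (the input shape is arbitrary):

```python
import torch
import torch.nn as nn

# A 4D input of shape (N, C, H, W)
x = torch.randn(8, 3, 32, 32)

bn = nn.BatchNorm2d(num_features=3)  # one gamma/beta pair per channel
bn.train()
y = bn(x)

# What BatchNorm2d does in training mode: per-channel normalization
# over the (N, H, W) dimensions, then the affine transform gamma * x_hat + beta.
mean = x.mean(dim=(0, 2, 3), keepdim=True)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)  # biased variance
x_hat = (x - mean) / torch.sqrt(var + bn.eps)
y_manual = x_hat * bn.weight.view(1, -1, 1, 1) + bn.bias.view(1, -1, 1, 1)

print(torch.allclose(y, y_manual, atol=1e-6))  # True
```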
    The affine parameter, as discussed in the forum thread "How to set learning rate as 0 in BN layer", can be understood as follows:
Setting affine=False will remove the gamma and beta terms from the calculation, thus only using the running mean and var. So that's basically what you want. I don't know how Caffe works, but setting the learning rate to 0 is something different in my opinion, since you could still have the gamma and beta terms with constant (random) values.
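That difference is easy to verify (a small sketch; the channel count is arbitrary):

```python
import torch.nn as nn

bn_affine = nn.BatchNorm2d(3, affine=True)      # learnable gamma/beta
bn_no_affine = nn.BatchNorm2d(3, affine=False)  # no gamma/beta at all

print(bn_affine.weight.shape)            # torch.Size([3])  (gamma)
print(bn_affine.bias.shape)              # torch.Size([3])  (beta)
print(bn_no_affine.weight)               # None
print(list(bn_no_affine.parameters()))   # [] -- nothing for the optimizer
```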

Also, be careful with the momentum argument, since it is different from the one used in optimizer classes and from the conventional notion of momentum. Have a look at this note [11]. You probably want to lower it, i.e. set it to something like 0.001.
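A sketch of what momentum actually controls here: it is the weight of the current batch statistics in an exponential moving average of the running statistics, not an optimizer-style momentum:

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(3, momentum=0.1)
x = torch.randn(8, 3, 32, 32)

old_mean = bn.running_mean.clone()
bn.train()
bn(x)

# running_mean = (1 - momentum) * running_mean + momentum * batch_mean
batch_mean = x.mean(dim=(0, 2, 3))
expected = (1 - bn.momentum) * old_mean + bn.momentum * batch_mean
print(torch.allclose(bn.running_mean, expected, atol=1e-6))  # True
```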

If track_running_stats is set to True, the vanilla batch norm is used, i.e. the batch statistics are stored during training in running_mean and running_var. Setting the model to evaluation mode will then use these statistics for all validation samples. Setting track_running_stats to False will use the statistics of the current batch even in evaluation mode.
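The difference shows up directly in evaluation mode (a minimal sketch):

```python
import torch
import torch.nn as nn

x = torch.randn(8, 3, 32, 32)

bn_tracked = nn.BatchNorm2d(3, track_running_stats=True).eval()
bn_untracked = nn.BatchNorm2d(3, track_running_stats=False).eval()

# With tracking, eval mode uses the stored running_mean/running_var
# (here still the initial 0/1, since the layer never saw training data).
y_tracked = bn_tracked(x)

# Without tracking, eval mode still normalizes with the statistics
# of the current batch.
y_untracked = bn_untracked(x)

print(torch.allclose(y_tracked, y_untracked))  # False in general
```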
  • The torch source implementation of BatchNorm2D; the part below is the __init__ function (a simplified sketch follows the figure placeholder):
    [Figure 3: screenshot of the __init__ source code]
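Since the screenshot is not reproduced here, below is a simplified sketch of what the __init__ of PyTorch's batch-norm base class roughly does (reconstructed from the public source; exact details vary across versions, and the class name here is made up):

```python
import torch
from torch.nn.parameter import Parameter

class _BatchNormSketch(torch.nn.Module):
    """Simplified sketch of _BatchNorm.__init__, not the verbatim source."""

    def __init__(self, num_features, eps=1e-5, momentum=0.1,
                 affine=True, track_running_stats=True):
        super().__init__()
        self.num_features = num_features
        self.eps = eps
        self.momentum = momentum
        self.affine = affine
        self.track_running_stats = track_running_stats
        if self.affine:
            # gamma and beta are learnable, so they are Parameters
            self.weight = Parameter(torch.ones(num_features))
            self.bias = Parameter(torch.zeros(num_features))
        else:
            self.register_parameter('weight', None)
            self.register_parameter('bias', None)
        if self.track_running_stats:
            # running statistics are persistent state but not learnable,
            # so they are registered as buffers
            self.register_buffer('running_mean', torch.zeros(num_features))
            self.register_buffer('running_var', torch.ones(num_features))
            self.register_buffer('num_batches_tracked',
                                 torch.tensor(0, dtype=torch.long))
        else:
            self.register_buffer('running_mean', None)
            self.register_buffer('running_var', None)
            self.register_buffer('num_batches_tracked', None)
```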

Analysis:

  • The Chinese manual entry for Parameter, with details below (a small example follows the figure placeholder):
    [Figure 4: screenshot of the Parameter documentation]
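To make that concrete, a small sketch of what nn.Parameter does (the Scale module is made up for illustration):

```python
import torch
from torch.nn.parameter import Parameter

class Scale(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # A Parameter assigned as a module attribute is registered
        # automatically, so it shows up in parameters() and gets trained.
        self.scale = Parameter(torch.ones(3))
        # A plain tensor attribute is NOT registered.
        self.plain = torch.ones(3)

m = Scale()
print([name for name, _ in m.named_parameters()])  # ['scale']
```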
  • The register_parameter(name, param) source code, analyzed below (usage example after the figure placeholder):
    [Figure 5: screenshot of the register_parameter source code]
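A small usage sketch (the names gamma/beta are arbitrary):

```python
import torch
from torch.nn.parameter import Parameter

m = torch.nn.Module()

# register_parameter(name, param) registers a Parameter under `name`;
# it is equivalent to assigning a Parameter attribute on the module.
m.register_parameter('gamma', Parameter(torch.ones(3)))
print('gamma' in dict(m.named_parameters()))  # True

# Registering None reserves the name without creating a parameter,
# which is what BatchNorm2d does for weight/bias when affine=False.
m.register_parameter('beta', None)
print(m.beta)  # None
```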
  • The register_buffer(name, tensor) source code, analyzed below (usage example after the figure placeholder):
    [Figure 6: screenshot of the register_buffer source code]
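A small usage sketch showing the key property of buffers:

```python
import torch

m = torch.nn.Module()

# register_buffer(name, tensor) stores persistent, non-learnable state:
# it is saved in state_dict() but never returned by parameters(), so an
# optimizer will not touch it. BatchNorm2d uses buffers for
# running_mean and running_var.
m.register_buffer('running_mean', torch.zeros(3))

print(list(m.parameters()))              # []   -- not a parameter
print('running_mean' in m.state_dict())  # True -- but saved and loaded
```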
