PyTorch Implementation: Batch Normalization

Table of Contents

  • Advantages
  • 【How BN Is Computed】
  • 【BN in PyTorch】
  • 【Code】
    • 【nn.BatchNorm1d】
    • 【nn.BatchNorm2d】

Batch: a batch of data, usually a mini-batch.
Normalization: standardize to zero mean and unit variance.

Advantages

1. A larger learning rate can be used, which speeds up convergence.
2. Careful weight initialization is no longer critical.
3. Dropout can be removed or reduced.
4. L2 regularization (weight decay) can be removed or reduced.
5. LRN (local response normalization) is no longer needed.

【How BN Is Computed】

[Figure 1: the BN computation procedure]
affine transform: $y_i = \gamma \hat{x}_i + \beta$, where $\gamma$ and $\beta$ are learnable parameters (exposed in PyTorch as `weight` and `bias`).
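For completeness, the standard BN computation over a mini-batch $\mathcal{B} = \{x_1, \dots, x_m\}$ (applied per feature/channel) is:

$$\mu_{\mathcal{B}} = \frac{1}{m}\sum_{i=1}^{m} x_i, \qquad \sigma_{\mathcal{B}}^2 = \frac{1}{m}\sum_{i=1}^{m}\left(x_i - \mu_{\mathcal{B}}\right)^2$$

$$\hat{x}_i = \frac{x_i - \mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^2 + \epsilon}}, \qquad y_i = \gamma\,\hat{x}_i + \beta$$

During training, PyTorch also keeps running estimates for use at inference time, updated as running = (1 - momentum) * running + momentum * batch_stat (running_var is updated with the unbiased batch variance); the nn.BatchNorm1d code below checks this update rule by hand.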

【BN in PyTorch】

[Figures 2–4: PyTorch's BatchNorm modules]
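As a quick reference (taken from the public PyTorch API rather than from the figures above), the three BatchNorm modules share the same constructor arguments and differ only in the expected input shape; a minimal sketch:

import torch.nn as nn

# Shared constructor signature (PyTorch defaults):
#   num_features, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True
bn1d = nn.BatchNorm1d(num_features=5)   # expects input of shape (N, C) or (N, C, L)
bn2d = nn.BatchNorm2d(num_features=5)   # expects input of shape (N, C, H, W)
bn3d = nn.BatchNorm3d(num_features=5)   # expects input of shape (N, C, D, H, W)

# Each module keeps four per-channel tensors of shape (num_features,):
#   running_mean, running_var   (buffers, updated in training mode)
#   weight (gamma), bias (beta) (learnable parameters when affine=True)
print(bn2d.running_mean.shape, bn2d.weight.shape)   # torch.Size([5]) torch.Size([5])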

【Code】

【nn.BatchNorm1d】

# ======================================== nn.BatchNorm1d
import torch
import numpy as np
import torch.nn as nn
import sys, os

hello_pytorch_DIR = os.path.abspath(os.path.dirname(__file__)+os.path.sep+".."+os.path.sep+"..")
sys.path.append(hello_pytorch_DIR)
from PYTORCH.Deep_eye.Pytorch_Camp_master.hello_pytorch.tools.common_tools import set_seed

set_seed(1)  # set the random seed

flag = 1
# flag = 0
if flag:

    batch_size = 3
    num_features = 5  # number of features per sample
    momentum = 0.3    # EMA factor used to update the running mean/var

    features_shape = (1,)  # each feature is a single value

    feature_map = torch.ones(features_shape)                                                    # 1D: (1,)
    feature_maps = torch.stack([feature_map*(i+1) for i in range(num_features)], dim=0)         # 2D: (5, 1)
    feature_maps_bs = torch.stack([feature_maps for i in range(batch_size)], dim=0)             # 3D: (3, 5, 1)
    print("input data:\n{} shape is {}".format(feature_maps_bs, feature_maps_bs.shape))

    bn = nn.BatchNorm1d(num_features=num_features, momentum=momentum)

    running_mean, running_var = 0, 1  # PyTorch initializes running_mean=0, running_var=1

    for i in range(2):
        outputs = bn(feature_maps_bs)

        print("\niteration:{}, running mean: {} ".format(i, bn.running_mean))
        print("iteration:{}, running var:{} ".format(i, bn.running_var))

        mean_t, var_t = 2, 0  # batch mean/var of the 2nd feature: every value is 2, so mean=2, var=0

        running_mean = (1 - momentum) * running_mean + momentum * mean_t  # running_mean starts at 0
        running_var = (1 - momentum) * running_var + momentum * var_t     # running_var starts at 1

        print("iteration:{}, running mean of the 2nd feature: {} ".format(i, running_mean))
        print("iteration:{}, running var of the 2nd feature:{}".format(i, running_var))
input data:
tensor([[[1.],
         [2.],
         [3.],
         [4.],
         [5.]],

        [[1.],
         [2.],
         [3.],
         [4.],
         [5.]],

        [[1.],
         [2.],
         [3.],
         [4.],
         [5.]]]) shape is torch.Size([3, 5, 1])

iteration:0, running mean: tensor([0.3000, 0.6000, 0.9000, 1.2000, 1.5000]) 
iteration:0, running var:tensor([0.7000, 0.7000, 0.7000, 0.7000, 0.7000]) 
iteration:0, running mean of the 2nd feature: 0.6 
iteration:0, running var of the 2nd feature:0.7

iteration:1, running mean: tensor([0.5100, 1.0200, 1.5300, 2.0400, 2.5500]) 
iteration:1, running var:tensor([0.4900, 0.4900, 0.4900, 0.4900, 0.4900]) 
iteration:1, running mean of the 2nd feature: 1.02 
iteration:1, running var of the 2nd feature:0.48999999999999994
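A follow-up worth noting (a minimal sketch, not part of the original script; it reuses bn and feature_maps_bs from the snippet above): once the module is put into eval mode, BN stops using batch statistics and normalizes with the accumulated running_mean / running_var, so the running estimates printed above are exactly what inference uses.

bn.eval()                                  # switch to evaluation mode
with torch.no_grad():
    out_eval = bn(feature_maps_bs)         # normalized with running_mean / running_var

# manual check for the 2nd feature (its value is 2 everywhere in the input):
x = 2.0
manual = (x - bn.running_mean[1]) / torch.sqrt(bn.running_var[1] + bn.eps) \
         * bn.weight[1] + bn.bias[1]
print(out_eval[0, 1, 0].item(), manual.item())   # the two numbers should match
bn.train()                                 # switch back before any further training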

【nn.BatchNorm2d】

# ======================================== nn.BatchNorm2d
flag = 1
# flag = 0
if flag:

    batch_size = 3
    num_features = 6
    momentum = 0.3
    
    features_shape = (2, 2)

    feature_map = torch.ones(features_shape)                                                    # 2D: (2, 2)
    feature_maps = torch.stack([feature_map*(i+1) for i in range(num_features)], dim=0)         # 3D: (6, 2, 2)
    feature_maps_bs = torch.stack([feature_maps for i in range(batch_size)], dim=0)             # 4D: (3, 6, 2, 2)

    print("input data:\n{} shape is {}".format(feature_maps_bs, feature_maps_bs.shape))

    bn = nn.BatchNorm2d(num_features=num_features, momentum=momentum)

    running_mean, running_var = 0, 1

    for i in range(2):
        outputs = bn(feature_maps_bs)

        print("\niter:{}, running_mean.shape: {}".format(i, bn.running_mean.shape))
        print("iter:{}, running_var.shape: {}".format(i, bn.running_var.shape))

        print("iter:{}, weight.shape: {}".format(i, bn.weight.shape))
        print("iter:{}, bias.shape: {}".format(i, bn.bias.shape))
input data:
tensor([[[[1., 1.],
          [1., 1.]],

         [[2., 2.],
          [2., 2.]],

         [[3., 3.],
          [3., 3.]],

         [[4., 4.],
          [4., 4.]],

         [[5., 5.],
          [5., 5.]],

         [[6., 6.],
          [6., 6.]]],


        [[[1., 1.],
          [1., 1.]],

         [[2., 2.],
          [2., 2.]],

         [[3., 3.],
          [3., 3.]],

         [[4., 4.],
          [4., 4.]],

         [[5., 5.],
          [5., 5.]],

         [[6., 6.],
          [6., 6.]]],


        [[[1., 1.],
          [1., 1.]],

         [[2., 2.],
          [2., 2.]],

         [[3., 3.],
          [3., 3.]],

         [[4., 4.],
          [4., 4.]],

         [[5., 5.],
          [5., 5.]],

         [[6., 6.],
          [6., 6.]]]]) shape is torch.Size([3, 6, 2, 2])

iter:0, running_mean.shape: torch.Size([6])
iter:0, running_var.shape: torch.Size([6])
iter:0, weight.shape: torch.Size([6])
iter:0, bias.shape: torch.Size([6])

iter:1, running_mean.shape: torch.Size([6])
iter:1, running_var.shape: torch.Size([6])
iter:1, weight.shape: torch.Size([6])
iter:1, bias.shape: torch.Size([6])
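
To make the per-channel behaviour concrete, here is a small verification sketch (not part of the original post; the random input and the BN layer are created just for this example): nn.BatchNorm2d computes one mean and one variance per channel over the (N, H, W) dimensions, and in training mode the normalization itself uses the biased variance.

x = torch.randn(3, 6, 2, 2)                                 # (N, C, H, W)
bn2d = nn.BatchNorm2d(num_features=6)
y = bn2d(x)                                                 # training mode: batch statistics

mean = x.mean(dim=(0, 2, 3), keepdim=True)                  # per-channel mean over (N, H, W)
var = x.var(dim=(0, 2, 3), unbiased=False, keepdim=True)    # biased variance used for normalization
y_manual = (x - mean) / torch.sqrt(var + bn2d.eps)          # weight=1, bias=0 at initialization

print(torch.allclose(y, y_manual, atol=1e-5))               # expected: True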
