PyTorch Model Parameter Initialization

1. Introduction

Parameter learning during neural network training is optimized with gradient descent, which requires assigning each parameter an initial value before training begins. The choice of these initial values matters a great deal. In general we want both the data and the parameters to have zero mean, and the variance of a layer's output to match that of its input. In practice, initializing parameters from a Gaussian or a uniform distribution both work well.

Common parameter initialization methods for deep learning models:

(1) Gaussian: x ∼ N(mean, std²) with mean = 0, std = 1.

(2) Xavier: uniform distribution x ∼ U(−a, +a), where a = sqrt(3/n) and n is the layer fan-in.

(3) MSRA (He): Gaussian x ∼ N(0, σ²), where σ = sqrt(2/n).

(4) Uniform: x ∼ U(min, max) with min = 0, max = 1.
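As a rough sketch (the layer shape and the choice of `n` as the fan-in are my assumptions, not from the original post), the four schemes above can be reproduced with PyTorch's built-in `torch.nn.init` helpers:

```python
import math

import torch.nn as nn

layer = nn.Linear(256, 128)
n = layer.in_features  # fan-in of the layer

# (1) Gaussian: N(0, 1)
nn.init.normal_(layer.weight, mean=0.0, std=1.0)

# (2) Xavier-style uniform: U(-a, +a) with a = sqrt(3/n)
a = math.sqrt(3.0 / n)
nn.init.uniform_(layer.weight, -a, a)

# (3) MSRA / He: N(0, sigma^2) with sigma = sqrt(2/n)
nn.init.normal_(layer.weight, mean=0.0, std=math.sqrt(2.0 / n))

# (4) Uniform: U(0, 1)
nn.init.uniform_(layer.weight, 0.0, 1.0)
```

Note that `nn.init.xavier_uniform_` and `nn.init.kaiming_normal_` implement (2) and (3) directly, with slightly different gain conventions.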

Take the VGG-16 model as an example.

VGG model code (note: the constructor below defaults to depth=19; pass depth=16 for VGG-16):

import math

import torch
import torch.nn as nn
from torch.autograd import Variable


__all__ = ['vgg']

defaultcfg = {
    11 : [64, 'M', 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512],
    13 : [64, 64, 'M', 128, 128, 'M', 256, 256, 'M', 512, 512, 'M', 512, 512],
    16 : [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 'M', 512, 512, 512, 'M', 512, 512, 512],
    19 : [64, 64, 'M', 128, 128, 'M', 256, 256, 256, 256, 'M', 512, 512, 512, 512, 'M', 512, 512, 512, 512],
}

class vgg(nn.Module):
    def __init__(self, dataset='cifar10', depth=19, init_weights=True, cfg=None):
        super(vgg, self).__init__()
        if cfg is None:
            cfg = defaultcfg[depth]

        self.cfg = cfg

        self.feature = self.make_layers(cfg, True)

        if dataset == 'cifar10':
            num_classes = 10
        elif dataset == 'cifar100':
            num_classes = 100
        else:
            raise ValueError('unsupported dataset: %s' % dataset)
        self.classifier = nn.Sequential(
              nn.Linear(cfg[-1], 512),
              nn.BatchNorm1d(512),
              nn.ReLU(inplace=True),
              nn.Linear(512, num_classes)
            )
        if init_weights:
            self._initialize_weights()

    def make_layers(self, cfg, batch_norm=False):
        layers = []
        in_channels = 3
        for v in cfg:
            if v == 'M':
                layers += [nn.MaxPool2d(kernel_size=2, stride=2)]
            else:
                conv2d = nn.Conv2d(in_channels, v, kernel_size=3, padding=1, bias=False)
                if batch_norm:
                    layers += [conv2d, nn.BatchNorm2d(v), nn.ReLU(inplace=True)]
                else:
                    layers += [conv2d, nn.ReLU(inplace=True)]
                in_channels = v
        return nn.Sequential(*layers)

    def forward(self, x):
        x = self.feature(x)
        x = nn.AvgPool2d(2)(x)
        x = x.view(x.size(0), -1)
        y = self.classifier(x)
        return y

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                # He (MSRA) initialization: std = sqrt(2/n), where n is the
                # fan-out (kernel_h * kernel_w * out_channels)
                n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
                m.weight.data.normal_(0, math.sqrt(2. / n))
                if m.bias is not None:
                    m.bias.data.zero_()
            elif isinstance(m, nn.BatchNorm2d):
                # note: the stock torchvision VGG fills BN weights with 1;
                # 0.5 is this implementation's choice
                m.weight.data.fill_(0.5)
                m.bias.data.zero_()
            elif isinstance(m, nn.Linear):
                m.weight.data.normal_(0, 0.01)
                m.bias.data.zero_()

if __name__ == '__main__':
    net = vgg()
    x = torch.randn(16, 3, 40, 40)  # Variable is deprecated; plain tensors suffice
    y = net(x)
    print(y.shape)
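As an aside not in the original post, a quick way to see why the sqrt(2/n) scaling in `_initialize_weights` matters: push random input through a deep ReLU conv stack and compare activation scales under He initialization versus a fixed small std (all names below are illustrative):

```python
import math

import torch
import torch.nn as nn

def make_stack(std_fn, depth=8, ch=64):
    # Build a deep conv/ReLU stack whose weights are drawn from
    # N(0, std_fn(n)^2), with n the fan-out of each conv layer.
    layers = []
    for _ in range(depth):
        conv = nn.Conv2d(ch, ch, 3, padding=1, bias=False)
        n = conv.kernel_size[0] * conv.kernel_size[1] * conv.out_channels
        conv.weight.data.normal_(0, std_fn(n))
        layers += [conv, nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)

x = torch.randn(4, 64, 16, 16)
with torch.no_grad():
    he = make_stack(lambda n: math.sqrt(2.0 / n))(x).std().item()
    tiny = make_stack(lambda n: 0.01)(x).std().item()

# He init keeps activations near unit scale; a fixed std of 0.01
# makes them shrink layer by layer until they effectively vanish.
print(he, tiny)
```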

Initialization code

This is the initialization used in the VGG source code referenced by the PyTorch documentation. Essentially any CNN can be initialized this way.

def _initialize_weights(self):
    for m in self.modules():
        if isinstance(m, nn.Conv2d):
            n = m.kernel_size[0] * m.kernel_size[1] * m.out_channels
            m.weight.data.normal_(0, math.sqrt(2. / n))
            if m.bias is not None:
                m.bias.data.zero_()
        elif isinstance(m, nn.BatchNorm2d):
            m.weight.data.fill_(0.5)
            m.bias.data.zero_()
        elif isinstance(m, nn.Linear):
            m.weight.data.normal_(0, 0.01)
            m.bias.data.zero_()

The weights here are initialized from a normal (Gaussian) distribution:

m.weight.data.normal_(0, math.sqrt(2. / n))
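This is He (MSRA) initialization with n taken as the fan-out (kernel_h × kernel_w × out_channels). As a hedged sketch, current PyTorch exposes the same rule directly through `nn.init.kaiming_normal_`, typically applied to a whole model via `Module.apply` (the `init_weights` helper and the toy network here are illustrative, not from the original post):

```python
import torch.nn as nn

def init_weights(m):
    # He/Kaiming normal init in fan_out mode, which matches
    # n = k * k * out_channels from the snippet above
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
        if m.bias is not None:
            nn.init.zeros_(m.bias)
    elif isinstance(m, nn.Linear):
        nn.init.normal_(m.weight, 0, 0.01)
        nn.init.zeros_(m.bias)

net = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(16 * 8 * 8, 10),
)
net.apply(init_weights)  # applies init_weights to every submodule recursively
```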
References

(Original) Model parameter initialization - darkknightzh - cnblogs
https://www.cnblogs.com/darkknightzh/p/8297793.html

PyTorch notes on details - 三年一梦 - cnblogs
https://www.cnblogs.com/king-lps/p/8570021.html

Methods for initializing deep learning model parameters - 下路派出所 - cnblogs
https://www.cnblogs.com/callyblog/p/9714656.html

PyTorch learning series (9): parameter initialization - CodeTutor - CSDN
https://blog.csdn.net/VictoriaW/article/details/72872036

Common initialization and regularization in PyTorch - Jianshu
https://www.jianshu.com/p/902bb29209ed

PyTorch model training (2): model initialization - Mingx9527 - CSDN
https://blog.csdn.net/u011681952/article/details/86579998#11__20

Parameter initialization in deep learning - Man - CSDN
https://blog.csdn.net/mzpmzk/article/details/79839047

Other articles to read:

Loading models and initializing weights in PyTorch - 码神岛
https://msd.misuland.com/pd/2884250137616453910

Summary of parameter initialization methods in PyTorch - ys1305 - CSDN
https://blog.csdn.net/ys1305/article/details/94332007

[PyTorch] Several ways to initialize model parameters (repost) - 向前奔跑的少年 - cnblogs
https://www.cnblogs.com/kk17/p/10088301.html#_labelTop

PyTorch weight initialization (2) - CS_lcylmh - CSDN
https://blog.csdn.net/qq_19598705/article/details/80935786

Search results for 'data.normal_ topic:157' - PyTorch Forums
https://discuss.pytorch.org/search?q=data.normal_%20topic%3A157

Weight initilzation - PyTorch Forums
https://discuss.pytorch.org/t/weight-initilzation/157
