nn.Parameter: a Tensor subclass that marks a tensor as a learnable parameter, e.g. weight, bias
nn.Module: the base class of all network layers; it manages the module's attributes
nn.functional: functional implementations of operations such as convolution, pooling, and activation functions
nn.init: parameter initialization methods
·parameters: stores and manages nn.Parameter objects (weights, biases, etc.)
·modules: stores and manages nn.Module objects (the submodules, e.g. conv layers, pooling layers)
·buffers: stores and manages buffer attributes, such as running_mean in a BN layer
·***_hooks: stores and manages hook functions (there are five hook dicts)
self._parameters = OrderedDict()
self._buffers = OrderedDict()
self._backward_hooks = OrderedDict()
self._forward_hooks = OrderedDict()
self._forward_pre_hooks = OrderedDict()
self._state_dict_hooks = OrderedDict()
self._load_state_dict_pre_hooks = OrderedDict()
self._modules = OrderedDict()
Every Module uses these eight ordered dictionaries to manage its attributes.
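A minimal sketch of how these dictionaries are filled (the class Tiny and its attribute names are made up for illustration): assigning an nn.Parameter attribute registers it in _parameters, and assigning an nn.Module attribute registers it in _modules.

import torch
import torch.nn as nn

class Tiny(nn.Module):
    def __init__(self):
        super(Tiny, self).__init__()
        self.scale = nn.Parameter(torch.ones(1))  # goes into self._parameters
        self.fc = nn.Linear(4, 2)                 # goes into self._modules

t = Tiny()
print(t._parameters.keys())  # odict_keys(['scale'])
print(t._modules.keys())     # odict_keys(['fc'])
print([name for name, p in t.named_parameters()])  # ['scale', 'fc.weight', 'fc.bias']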
Example: defining the submodules:
import torch.nn as nn
import torch.nn.functional as F

class LeNet(nn.Module):
    def __init__(self, num_classes):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, num_classes)
Example: wiring the submodules together (the forward pass):
    def forward(self, x):
        out = F.relu(self.conv1(x))
        out = F.max_pool2d(out, 2)
        out = F.relu(self.conv2(out))
        out = F.max_pool2d(out, 2)
        out = out.view(out.size(0), -1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        out = self.fc3(out)
        return out
Note:
You must call the parent class's initializer:
super(LeNet, self).__init__()
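The reason: nn.Module.__init__ is what creates the eight ordered dictionaries above, and nn.Module.__setattr__ needs them when registering attributes. A minimal sketch of the failure mode (the class name BadNet is made up for illustration):

import torch.nn as nn

class BadNet(nn.Module):
    def __init__(self):
        # super(BadNet, self).__init__() is deliberately missing
        self.fc = nn.Linear(4, 2)

# BadNet()  # raises AttributeError: cannot assign module before Module.__init__() call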
nn.Sequential: wraps multiple layers and runs them in order
nn.ModuleList: wraps multiple layers like a Python list
nn.ModuleDict: wraps multiple layers like a Python dict
Example: the same LeNet built with nn.Sequential:

import torch.nn as nn

class LeNet(nn.Module):
    def __init__(self, num_classes):
        super(LeNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(6, 16, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        self.classifier = nn.Sequential(
            nn.Linear(16 * 5 * 5, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)
        x = self.classifier(x)
        return x
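By default nn.Sequential indexes its layers by integer. Passing an OrderedDict instead gives each layer a name, which makes the printed model and the state_dict easier to read. A minimal sketch (the layer names are my own choices):

from collections import OrderedDict
import torch.nn as nn

features = nn.Sequential(OrderedDict([
    ('conv1', nn.Conv2d(3, 6, 5)),
    ('relu1', nn.ReLU()),
    ('pool1', nn.MaxPool2d(kernel_size=2, stride=2)),
]))
print(features.conv1)  # layers are accessible by name instead of index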
append(): add a layer at the end of the ModuleList
extend(): concatenate two ModuleLists
insert(): insert a layer at a given position
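A minimal sketch of the three methods (the layer sizes are arbitrary):

import torch.nn as nn

layers = nn.ModuleList([nn.Linear(10, 10)])
layers.append(nn.Linear(10, 10))                               # add one layer at the end
layers.extend(nn.ModuleList([nn.ReLU(), nn.Linear(10, 10)]))   # concatenate another ModuleList
layers.insert(1, nn.ReLU())                                    # insert at position 1
print(len(layers))  # 5

The full ModuleList example below iterates over 20 identical linear layers: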
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, num_classes):
        super(Net, self).__init__()
        # 20 fully-connected layers, stored like a Python list
        self.linears = nn.ModuleList([nn.Linear(10, 10) for _ in range(20)])

    def forward(self, x):
        for i, linear in enumerate(self.linears):
            x = linear(x)
        return x

a = torch.zeros((10, 10))
net = Net(1)
b = net(a)
print(b.shape)  # torch.Size([10, 10])
ModuleDict example, selecting layers by key:

import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, num_classes):
        super(Net, self).__init__()
        # layers stored like a Python dict, selectable by key
        self.choices = nn.ModuleDict({
            'conv': nn.Conv2d(10, 10, 3),
            'pool': nn.MaxPool2d(3)
        })
        self.activations = nn.ModuleDict({
            'relu': nn.ReLU(),
            'prelu': nn.PReLU()
        })

    def forward(self, x):
        x = self.choices['conv'](x)
        x = self.activations['relu'](x)
        return x

a = torch.zeros((4, 10, 32, 32))
net = Net(1)
b = net(a)
print(b.shape)  # torch.Size([4, 10, 30, 30])
Convolution
nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0,
          dilation=1, groups=1, bias=True, padding_mode='zeros')
·in_channels: number of input channels
·out_channels: number of output channels, equal to the number of convolution kernels
·kernel_size: size of the convolution kernel
·stride: stride
·padding: number of padding pixels
·dilation: spacing between kernel elements (dilated/atrous convolution)
·groups: number of groups for grouped convolution
·bias: whether to add a learnable bias
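The output spatial size follows out = floor((in + 2*padding - dilation*(kernel_size-1) - 1) / stride) + 1. A quick sketch to verify it (the shapes are arbitrary):

import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=6, kernel_size=5)
x = torch.zeros(1, 3, 32, 32)
print(conv(x).shape)  # torch.Size([1, 6, 28, 28]): (32 + 0 - 1*(5-1) - 1)/1 + 1 = 28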
Transposed convolution
nn.ConvTranspose2d(in_channels, out_channels, kernel_size, stride=1, padding=0,
                   output_padding=0, groups=1, bias=True, dilation=1, padding_mode='zeros')
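Transposed convolution is commonly used for upsampling; its output size is out = (in - 1)*stride - 2*padding + dilation*(kernel_size-1) + output_padding + 1. A quick sketch (the shapes are arbitrary):

import torch
import torch.nn as nn

deconv = nn.ConvTranspose2d(in_channels=6, out_channels=3, kernel_size=2, stride=2)
x = torch.zeros(1, 6, 14, 14)
print(deconv(x).shape)  # torch.Size([1, 3, 28, 28]): (14 - 1)*2 - 0 + 1*(2-1) + 0 + 1 = 28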
Pooling
nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1, return_indices=False, ceil_mode=False)
·kernel_size: pooling window size
·stride: stride (defaults to kernel_size)
·padding: number of padding pixels
·dilation: spacing between pooling window elements
·ceil_mode: round the output size up instead of down
·return_indices: also return the indices of the max values (used later by nn.MaxUnpool2d)
nn.AvgPool2d(kernel_size, stride=None, padding=0, ceil_mode=False, count_include_pad=True, divisor_override=None)
·count_include_pad: whether padded values are included when computing the average
·divisor_override: if set, this value is used as the divisor instead of the window size
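A quick sketch of divisor_override (the values are arbitrary): with a 2x2 window the sum over the window is normally divided by 4; divisor_override replaces that divisor.

import torch
import torch.nn as nn

x = torch.ones(1, 1, 4, 4)
print(nn.AvgPool2d(2)(x))                      # every output element is 4/4 = 1.0
print(nn.AvgPool2d(2, divisor_override=2)(x))  # every output element is 4/2 = 2.0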
nn.MaxUnpool2d(kernel_size, stride=None, padding=0)
import torch
import torch.nn as nn

class Net(nn.Module):
    def __init__(self, num_classes):
        super(Net, self).__init__()
        self.choices = nn.ModuleDict({
            'conv': nn.Conv2d(10, 10, 3),
            'pool': nn.MaxPool2d((2, 2), stride=(2, 2), return_indices=True),
            'unpool': nn.MaxUnpool2d((2, 2), stride=(2, 2))
        })
        self.activations = nn.ModuleDict({
            'relu': nn.ReLU(),
            'prelu': nn.PReLU()
        })

    def forward(self, x):
        # max pooling returns both the pooled values and their indices
        x, indices = self.choices['pool'](x)
        # unpooling uses the indices to put values back at their original positions
        x = self.choices['unpool'](x, indices)
        x = self.activations['relu'](x)
        return x
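A quick usage check (the input shape is arbitrary): pooling halves the spatial size and unpooling restores it, placing each value back at its recorded max position and zeros elsewhere.

a = torch.zeros((4, 10, 32, 32))
net = Net(1)
b = net(a)
print(b.shape)  # torch.Size([4, 10, 32, 32])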
nn.Linear(in_features, out_features, bias=True)
nn.Sigmoid() / nn.Tanh() / nn.ReLU()
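A minimal sketch of a linear layer followed by an activation (the sizes are arbitrary): nn.Linear computes y = x @ weight.T + bias, mapping in_features to out_features.

import torch
import torch.nn as nn

fc = nn.Linear(in_features=8, out_features=3)
x = torch.randn(2, 8)        # a batch of 2 samples, 8 features each
y = nn.ReLU()(fc(x))
print(y.shape)               # torch.Size([2, 3])
print(fc.weight.shape)       # torch.Size([3, 8])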
Proper weight initialization speeds up model convergence, while poor initialization can cause vanishing or exploding gradients.
Each layer's gradient depends on the output of the previous layer. For a linear layer
H2 = H1 * W2
the chain rule gives the gradient with respect to W2 as
ΔW2 = ∂Loss/∂H2 * H1
so the gradient scales with H1: if H1 is very small the gradient vanishes; if H1 is very large the gradient explodes.
PyTorch layers come with their own default initialization, so after defining the network structure it also works to skip explicit parameter initialization.
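A minimal sketch of explicit initialization with nn.init (choosing Kaiming for conv layers and Xavier for linear layers is a common convention, not something the API mandates; the helper name initialize_weights is my own):

import torch.nn as nn

def initialize_weights(model):
    for m in model.modules():
        if isinstance(m, nn.Conv2d):
            # Kaiming initialization is designed for ReLU-family activations
            nn.init.kaiming_normal_(m.weight)
            if m.bias is not None:
                nn.init.zeros_(m.bias)
        elif isinstance(m, nn.Linear):
            # Xavier initialization keeps the activation variance roughly constant
            nn.init.xavier_normal_(m.weight)
            nn.init.zeros_(m.bias)

net = LeNet(num_classes=10)
initialize_weights(net)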