In many neural networks, several layers share a single weight tensor. PyTorch makes this kind of weight sharing straightforward.
Example 1:
import torch
import torch.nn as nn

class ConvNet(nn.Module):
    def __init__(self):
        super(ConvNet, self).__init__()
        # A single weight tensor shared by both convolutions below.
        self.conv_weight = nn.Parameter(torch.randn(3, 3, 5, 5))

    def forward(self, x):
        x = nn.functional.conv2d(x, self.conv_weight, bias=None, stride=1,
                                 padding=2, dilation=1, groups=1)
        x = nn.functional.conv2d(x, self.conv_weight.transpose(2, 3).contiguous(),
                                 bias=None, stride=1, padding=0, dilation=1, groups=1)
        return x
The code above defines two convolutional layers that share one weight, conv_weight: the first convolution uses conv_weight itself, and the second uses its transpose. Note that when running on the GPU, transpose() must be followed by .contiguous() to make the transposed tensor contiguous in memory; otherwise an error is raised.
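As a quick sanity check (a small sketch added here, not part of the original post; the input size is an assumption), the shared weight is registered only once, and the gradients from both convolutions accumulate into that single tensor:

model = ConvNet()
print(len(list(model.parameters())))    # 1 -- only conv_weight is registered
out = model(torch.randn(2, 3, 32, 32))  # dummy batch with 3 input channels
out.sum().backward()
print(model.conv_weight.grad.shape)     # torch.Size([3, 3, 5, 5]); both convolutions contribute to this gradient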
Example 2:
class LinearNet(nn.Module):
    def __init__(self):
        super(LinearNet, self).__init__()
        # One Parameter, used as-is and transposed in forward().
        self.linear_weight = nn.Parameter(torch.randn(3, 3))

    def forward(self, x):
        x = nn.functional.linear(x, self.linear_weight)
        x = nn.functional.linear(x, self.linear_weight.t())
        return x
This network implements a two-layer perceptron; the two layers again share a single Parameter, used once as-is and once transposed.
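If you prefer the module API over nn.functional, the same Parameter object can simply be assigned to two nn.Linear layers. This is an alternative sketch, not from the original post, and TiedLinearNet is a made-up name; note that it ties the weights directly rather than tying one layer to the transpose of the other:

class TiedLinearNet(nn.Module):
    def __init__(self):
        super(TiedLinearNet, self).__init__()
        self.fc1 = nn.Linear(3, 3, bias=False)
        self.fc2 = nn.Linear(3, 3, bias=False)
        self.fc2.weight = self.fc1.weight  # both layers now point at one Parameter

    def forward(self, x):
        return self.fc2(self.fc1(x))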
Example 3:
class LinearNet2(nn.Module):
    def __init__(self):
        super(LinearNet2, self).__init__()
        self.w = nn.Parameter(torch.FloatTensor([[1.1, 0, 0], [0, 1, 0], [0, 0, 1]]))

    def forward(self, x):
        x = x.mm(self.w)
        x = x.mm(self.w.t())
        return x
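Example 3 achieves the same effect with plain matrix multiplication: both x.mm(self.w) and x.mm(self.w.t()) read the single Parameter w. A minimal usage sketch, not from the original post (the batch size here is an assumption):

net = LinearNet2()
out = net(torch.randn(4, 3))  # batch of 4 three-dimensional inputs
out.sum().backward()
print(net.w.grad)             # one gradient tensor, accumulated over both uses of w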
Author: 马管子 | Source: CSDN | Original: https://blog.csdn.net/qq_19672579/article/details/79373985 (original article; include the link when reposting).
import random
import torch
class DynamicNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we construct three nn.Linear instances that we will use
        in the forward pass.
        """
        super(DynamicNet, self).__init__()
        self.input_linear = torch.nn.Linear(D_in, H)
        self.middle_linear = torch.nn.Linear(H, H)
        self.output_linear = torch.nn.Linear(H, D_out)
    def forward(self, x):
        """
        For the forward pass of the model, we randomly choose either 0, 1, 2, or 3
        and reuse the middle_linear Module that many times to compute hidden layer
        representations.
        Since each forward pass builds a dynamic computation graph, we can use normal
        Python control-flow operators like loops or conditional statements when
        defining the forward pass of the model.
        Here we also see that it is perfectly safe to reuse the same Module many
        times when defining a computational graph. This is a big improvement from Lua
        Torch, where each Module could be used only once.
        """
        h_relu = self.input_linear(x).clamp(min=0)
        for _ in range(random.randint(0, 3)):
            # Reuse middle_linear here: its weights are shared across iterations,
            # which is more convenient than in TensorFlow.
            h_relu = self.middle_linear(h_relu).clamp(min=0)
        y_pred = self.output_linear(h_relu)
        return y_pred
N, D_in, H, D_out = 64, 1000, 100, 10
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)
model = DynamicNet(D_in, H, D_out)
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
for t in range(500):
    # Forward pass: compute predicted y by passing x to the model
    y_pred = model(x)
    # Compute and print loss
    loss = criterion(y_pred, y)
    print(t, loss.item())
    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
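One way to confirm that reusing middle_linear does not duplicate its weights (a small check added here, not part of the original post): the model exposes exactly three weight/bias pairs, no matter how many times the loop body runs in a given forward pass.

params = dict(model.named_parameters())
print(sorted(params.keys()))
# ['input_linear.bias', 'input_linear.weight',
#  'middle_linear.bias', 'middle_linear.weight',
#  'output_linear.bias', 'output_linear.weight']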
Author: HxShine | Source: CSDN | Original: https://blog.csdn.net/qq_16949707/article/details/72626448 (original article; include the link when reposting).