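PyTorch variables fall into three broad categories: CPU tensors, GPU tensors, and Variable. These mean, respectively, that the data takes part in computation on the CPU, that it takes part in computation on the GPU, and that it has been added to the gradient computation graph. Converting among the three is straightforward:
CPU to GPU: t.cuda()
GPU to CPU: t.cpu()
CPU or GPU tensor to Variable: Variable(t)
Variable to CPU or GPU tensor: v.data
tensor to numpy: t.numpy() (t must be on the CPU)
numpy to tensor: torch.from_numpy()
Note that y = Variable(t.cuda()) creates a single graph node y, whereas y = Variable(t).cuda() creates two graph nodes: the Variable wrapping t, and y.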
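As a rough illustration of the conversions listed above, the sketch below round-trips a tensor between devices and NumPy. It assumes PyTorch 0.4 or later (where Variable and Tensor were merged) and falls back to the CPU when no GPU is present; the variable names are made up for the example.

```python
import torch

t = torch.tensor([1.0, 2.0, 3.0])          # starts on the CPU

# CPU -> GPU, guarded so the sketch also runs on machines without CUDA
t_dev = t.cuda() if torch.cuda.is_available() else t
t_cpu = t_dev.cpu()                        # GPU -> CPU (no-op if already on the CPU)

arr = t_cpu.numpy()                        # tensor -> numpy, shares memory
t_back = torch.from_numpy(arr)             # numpy -> tensor, shares memory too

arr[0] = 10.0                              # write through the numpy array
print(t_cpu[0].item(), t_back[0].item())   # both tensors see the new value
```

Because numpy() and torch.from_numpy() share memory with the CPU tensor, a copy (for example arr.copy()) is needed when the two should evolve independently.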
Converting [1] to 1: a single-element tensor is turned into a scalar, and the element type stays the same.
a = [1]
a[0]
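A small sketch of the same idea on newer PyTorch versions, assuming 0.4 or later: there, indexing returns a 0-dim tensor rather than a Python number, and item() extracts the Python scalar. The names t, s and e are only illustrative.

```python
import torch

t = torch.tensor([1.5])     # single-element tensor
s = t.item()                # plain Python float
e = t[0]                    # 0-dim tensor, still a torch.Tensor

print(type(s), e.dim(), e.dtype)
```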
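# detach_() turns a node of the computation graph into a leaf node, i.e. it sets the node's .grad_fn to None, so the node that preceded detach_() is no longer connected to the current variable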
>>> import torch
>>> from torch.autograd import Variable
>>>
>>> x = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
>>> y = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
>>> m = 1*x
>>> m.detach_()
>>> n = y.pow(3)
>>> z = m.pow(2)+3*n.pow(2)
>>> z.backward(torch.ones(2,3))
>>> print(x.grad) # no downstream variable is connected to x; x and m are disconnected
None
>>> print(y.grad)
Variable containing:
1.8000e+01 5.7600e+02 4.3740e+03
1.8432e+04 5.6250e+04 1.3997e+05
[torch.FloatTensor of size 2x3]
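The same experiment can be sketched with the merged Tensor API (assuming PyTorch 0.4 or later), which also makes it easy to inspect grad_fn directly. Since z = m^2 + 3*(y^3)^2, the expected gradient with respect to y is 18*y^5, which matches the numbers printed above.

```python
import torch

x = torch.tensor([[1., 2., 3.], [4., 5., 6.]], requires_grad=True)
y = torch.tensor([[1., 2., 3.], [4., 5., 6.]], requires_grad=True)

m = 1 * x
m.detach_()                 # m becomes a leaf: grad_fn is None, link to x is cut
n = y.pow(3)
z = m.pow(2) + 3 * n.pow(2)
z.backward(torch.ones(2, 3))

print(m.grad_fn, m.requires_grad)   # None False
print(x.grad)                       # None: nothing flows back to x
print(y.grad)                       # equals 18 * y**5
```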
Because PyTorch builds its graph dynamically, the effect of detach depends on where in the code it is called.
import torch
from torch.autograd import Variable
a = Variable(torch.randn(2, 2), requires_grad=True)
b = a * 2
c = b * 2
b.detach_()
c.sum().backward()
print(a.grad, b.grad, c.grad)
Variable containing:
4 4
4 4
[torch.FloatTensor of size 2x2]
None None
import torch
from torch.autograd import Variable
a = Variable(torch.randn(2, 2), requires_grad=True)
b = a * 2
b.detach_()
c = b * 2
c.sum().backward()
print(a.grad, b.grad, c.grad)
# Raises an error: element 0 of variables does not require grad and does not have a grad_fn
import torch
from torch.autograd import Variable
a = Variable(torch.randn(2, 2), requires_grad=True)
b = a * 2
d = a * 3
temp = b.detach()
c = temp * 2 + d
c.sum().backward()
print(a.grad, b.grad, c.grad, d.grad)
Variable containing:
3 3
3 3
[torch.FloatTensor of size 2x2]
None None None
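The cases above can be re-checked with a short sketch on the merged Tensor API (assuming PyTorch 0.4 or later). It covers the failing case, where the detach happens before the next operation, and the mixed case, where only one branch is detached; names ending in 2 are simply illustrative copies.

```python
import torch

# Case 1: detach before the next op, so nothing downstream requires grad.
a = torch.randn(2, 2, requires_grad=True)
b = a * 2
b.detach_()
c = b * 2
try:
    c.sum().backward()
    raised = False
except RuntimeError:
    raised = True

# Case 2: detach() one branch only; the gradient still flows through d2.
a2 = torch.randn(2, 2, requires_grad=True)
b2 = a2 * 2
d2 = a2 * 3
c2 = b2.detach() * 2 + d2
c2.sum().backward()

print(raised, a2.grad)      # a2.grad holds only the contribution of d2
```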
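Note that detach() returns a new Variable, but it still points to the same underlying tensor, so modifying that tensor in place also affects the original Variable.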
import torch
from torch.nn import init
from torch.autograd import Variable
t1 = torch.FloatTensor([1., 2.])
v1 = Variable(t1)
t2 = torch.FloatTensor([2., 3.])
v2 = Variable(t2)
v3 = v1 + v2
v3_detached = v3.detach()
v3_detached.data.add_(t1) # modifies the tensor held by the v3_detached Variable
print(v3, v3_detached) # the tensor in v3 changes as well
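One way to confirm the shared storage, sketched here with the merged Tensor API (assuming PyTorch 0.4 or later), is to compare data_ptr() on both handles after the in-place update.

```python
import torch

t1 = torch.tensor([1., 2.])
t2 = torch.tensor([2., 3.])
v3 = t1 + t2                    # [3., 5.]
v3_detached = v3.detach()       # new tensor object, same storage

v3_detached.add_(t1)            # in-place edit through the detached handle
same_storage = v3.data_ptr() == v3_detached.data_ptr()
print(v3, v3_detached, same_storage)
```

If an independent copy is wanted, v3.detach().clone() would avoid the aliasing.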
# If elements of a tensor are assigned directly by index, those elements no longer take part in the gradient computation
>>> import torch
>>> from torch.autograd import Variable
>>> x = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
>>> y = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
>>> m = 1*x
>>> m[(m>4).detach()] = 0
>>> print(m) # the values 5 and 6 in m have been overwritten with 0
Variable containing:
1 2 3
4 0 0
[torch.FloatTensor of size 2x3]
>>> n = y.pow(3)
>>> z = m.pow(2)+3*n.pow(2)
>>> z.backward(torch.ones(2,3))
>>> print(x.grad) # the gradient of x no longer contains the gradients for the values 5 and 6
Variable containing:
2 4 6
8 0 0
[torch.FloatTensor of size 2x3]
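A reduced version of the index-assignment experiment, assuming PyTorch 0.4 or later and keeping only the x branch, could look like this. The overwritten positions receive zero gradient because their values no longer depend on x.

```python
import torch

x = torch.tensor([[1., 2., 3.], [4., 5., 6.]], requires_grad=True)
m = 1 * x
m[m > 4] = 0                    # in-place masked write on a non-leaf tensor
z = m.pow(2)
z.backward(torch.ones(2, 3))

print(x.grad)                   # 2*m where m was kept, 0 where it was overwritten
```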
# requires_grad=False controls whether gradients are computed for a leaf variable
>>> import torch
>>> from torch.autograd import Variable
>>>
>>> x = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=False)
>>> y = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
>>> m = x.pow(2)
>>> n = y.pow(3)
>>> z = m.pow(2)+3*n.pow(2)
>>> z.backward(torch.ones(2,3))
>>> print(m.requires_grad)
False
>>> print(z.requires_grad)
True
>>> print(x.grad)
None
>>> print(y.grad)
Variable containing:
1.8000e+01 5.7600e+02 4.3740e+03
1.8432e+04 5.6250e+04 1.3997e+05
[torch.FloatTensor of size 2x3]
# requires_grad can only be changed on leaf variables
>>> import torch
>>> from torch.autograd import Variable
>>>
>>> x = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
>>> y = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
>>> m = x.pow(2)
>>> m.requires_grad = False
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: you can only change requires_grad flags of leaf variables. If you want to use a computed variable in a subgraph that doesn't require differentiation use var_no_grad = var.detach().
>>> n = y.pow(3)
>>> z = m.pow(2)+3*n.pow(2)
>>> z.backward(torch.ones(2,3))
>>> print(m.requires_grad)
True
>>> print(x.grad)
Variable containing:
4 32 108
256 500 864
[torch.FloatTensor of size 2x3]
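The restriction, and the detach() route that the error message recommends, can be sketched as follows. This assumes PyTorch 0.4 or later; the names error and m_no_grad are illustrative.

```python
import torch

x = torch.tensor([[1., 2., 3.], [4., 5., 6.]], requires_grad=True)
m = x.pow(2)                      # non-leaf: produced by an operation

try:
    m.requires_grad = False       # only allowed on leaf tensors
    error = None
except RuntimeError as exc:
    error = str(exc)

m_no_grad = m.detach()            # the route the error message recommends
print(m.is_leaf, m_no_grad.requires_grad, error is not None)
```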
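# Result of dividing a 2-D tensor by a 1-D tensor (the three printouts are the dividend, the divisor, and the quotient)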
Variable containing:
0.0000 0.2447 0.0000
0.0000 0.2447 0.0000
[torch.cuda.FloatTensor of size 2x3 (GPU 0)]
Variable containing:
0.0010
2.0010
0.0010
[torch.cuda.FloatTensor of size 3 (GPU 0)]
Variable containing:
0.0000 0.1223 0.0000
0.0000 0.1223 0.0000
[torch.cuda.FloatTensor of size 2x3 (GPU 0)]
# Result of dividing a 2-D tensor by a 1x3 tensor (same quotient as with the 1-D divisor)
Variable containing:
0.0000 0.2447 0.0000
0.0000 0.2447 0.0000
[torch.cuda.FloatTensor of size 2x3 (GPU 0)]
Variable containing:
0.0010 2.0010 0.0010
[torch.cuda.FloatTensor of size 1x3 (GPU 0)]
Variable containing:
0.0000 0.1223 0.0000
0.0000 0.1223 0.0000
[torch.cuda.FloatTensor of size 2x3 (GPU 0)]
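Both printouts follow from broadcasting: the divisor is stretched along the rows whether its shape is (3,) or (1, 3). A CPU sketch that reuses the rounded values printed above, purely as illustrative inputs:

```python
import torch

a = torch.tensor([[0.0, 0.2447, 0.0], [0.0, 0.2447, 0.0]])   # shape (2, 3)
b = torch.tensor([0.0010, 2.0010, 0.0010])                   # shape (3,)

r1 = a / b               # b is broadcast along the rows
r2 = a / b.view(1, 3)    # explicit 1x3 shape, same broadcast
print(r1.shape, torch.equal(r1, r2))
```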
Element ordering of d = c.view(-1,1).sum(1) (c is printed first, then d):
Variable containing:
(0 ,0 ,.,.) =
2.5000 6.5000
4.5000 4.0000
(0 ,1 ,.,.) =
2.5000 6.5000
4.5000 4.0000
(1 ,0 ,.,.) =
4.5000 6.5000
6.5000 6.5000
(1 ,1 ,.,.) =
4.5000 6.5000
6.5000 6.5000
[torch.cuda.FloatTensor of size 2x2x2x2 (GPU 0)]
Variable containing:
2.5000
6.5000
4.5000
4.0000
2.5000
6.5000
4.5000
4.0000
4.5000
6.5000
6.5000
6.5000
4.5000
6.5000
6.5000
6.5000
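The ordering seen above is the row-major (last index varies fastest) layout of a contiguous tensor. A minimal check, with arange values standing in for the original data, which the source does not show being constructed:

```python
import torch

c = torch.arange(16.0).view(2, 2, 2, 2)   # values 0..15 laid out row-major
d = c.view(-1, 1).sum(1)                   # one element per row, shape (16,)
print(d.shape, torch.equal(d, c.reshape(-1)))
```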
Non-leaf Variables do not store grad
xx = Variable(torch.randn(1,1), requires_grad = True)
yy = 3*xx
zz = yy**2
zz.backward()
xx.grad # 0.5137
yy.grad # None
zz.grad # None
Note:
a = Variable(torch.randn(2,10), requires_grad=True).cuda()
Here a is not a leaf. The Variable itself is a leaf, but calling cuda() on it produces another, non-leaf Variable.
The following does create a leaf:
a = Variable(torch.randn(2,10).cuda(), requires_grad=True)
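The leaf status can be checked with is_leaf. In the sketch below, which assumes PyTorch 0.4 or later, double() is used as a stand-in for cuda() so the first part runs without a GPU; the CUDA lines are guarded and only execute when a device is available.

```python
import torch

# Stand-in for .cuda(): an op applied after creation yields a non-leaf.
a = torch.randn(2, 10, requires_grad=True).double()
# Convert first, then mark as requiring grad: the result is a leaf.
b = torch.randn(2, 10).double().requires_grad_()

print(a.is_leaf, b.is_leaf)   # False True

if torch.cuda.is_available():
    g1 = torch.randn(2, 10, requires_grad=True).cuda()   # non-leaf copy
    g2 = torch.randn(2, 10).cuda().requires_grad_()      # leaf on the GPU
    print(g1.is_leaf, g2.is_leaf)
```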
To get the grad of a non-leaf Variable, register a hook:
yGrad = torch.zeros(1,1)
def extract(xVar):
    global yGrad
    yGrad = xVar
xx = Variable(torch.randn(1,1), requires_grad = True)
yy = 3*xx
zz = yy**2
yy.register_hook(extract)
#### Run the backprop:
print (yGrad) # Shows 0.
zz.backward()
print (yGrad) # Show the correct dzdy
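A variant of the hook approach that avoids the global, sketched under the assumption of PyTorch 0.4 or later. The grads dictionary and the save_grad helper are hypothetical names introduced for this example; retain_grad() is shown as an alternative that keeps .grad on the non-leaf itself. A fixed input of 2.0 is used so the values are easy to verify by hand.

```python
import torch

grads = {}                                  # hypothetical holder for captured gradients

def save_grad(name):
    def hook(grad):
        grads[name] = grad                  # store the gradient passed to the hook
    return hook

xx = torch.tensor([[2.0]], requires_grad=True)
yy = 3 * xx
zz = yy ** 2

yy.register_hook(save_grad("yy"))           # hook approach, as in the text
yy.retain_grad()                            # alternative: keep .grad on a non-leaf
zz.sum().backward()

print(grads["yy"], yy.grad, xx.grad)        # dz/dy = 2*yy = 12, dz/dx = 36
```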