Yesterday the WeChat account 机器之心 (Synced) published the big news that the open-source package PyTorch is out. From the description, the dynamic-computation-graph approach feels quite novel. After a first try of its automatic differentiation, it is clear that this package really does merge binding a graph's symbolic placeholder variables to their concrete values into a single step. That makes it noticeably more concise than other packages, and the learning curve is much gentler.
On CentOS 6.8, I recommend following the official instructions and using an Anaconda environment. Installing and using CUDA is also straightforward if you follow the official instructions. The API documentation at http://pytorch.org/docs is very detailed and includes many concrete examples.
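To get a feel for the define-by-run idea before the autograd examples, here is a minimal sketch of my own (the variable names are mine, not from any official example): because the graph is recorded as ordinary Python code runs, a data-dependent branch is differentiated without any special graph-construction API.
import torch
from torch.autograd import Variable
x = Variable(torch.ones(1)*2, requires_grad=True)
# The graph is built while this code executes, so the branch taken
# can depend on the current value of x.
if x.data[0] > 0:
    y = x.pow(3)   # dy/dx = 3x^2 = 12 at x = 2
else:
    y = -x
y.backward()
print(x.grad)      # Variable containing: 12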
Below are several examples of automatic differentiation.
Type 1: pure scalars
import torch
from torch.autograd import Variable
x = Variable(torch.ones(1)*3, requires_grad=True)
y = Variable(torch.ones(1)*4, requires_grad=True)
z = x.pow(2)+3*y.pow(2) # z = x^2+3y^2, dz/dx=2x, dz/dy=6y
z.backward() # for a pure scalar result the gradient argument can be omitted
print x.grad # at x = 3, dz/dx = 2x = 2*3 = 6
print y.grad # at y = 4, dz/dy = 6y = 6*4 = 24
Result:
Variable containing:
6
[torch.FloatTensor of size 1]
Variable containing:
24
[torch.FloatTensor of size 1]
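For a size-1 output like this, backward() with no argument is just shorthand; passing an explicit all-ones tensor of the same shape gives the same gradients. A small equivalence check (my own sketch):
import torch
from torch.autograd import Variable
x = Variable(torch.ones(1)*3, requires_grad=True)
y = Variable(torch.ones(1)*4, requires_grad=True)
z = x.pow(2)+3*y.pow(2)
z.backward(torch.ones(1))  # explicit form of the z.backward() call above
print(x.grad)              # 6
print(y.grad)              # 24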
Type 2: all-ones vectors
x = Variable(torch.ones(2)*3, requires_grad=True)
y = Variable(torch.ones(2)*4, requires_grad=True)
z = x.pow(2)+3*y.pow(2)
z.backward(torch.ones(2))
print x.grad
print y.grad
Result:
Variable containing:
6
6
[torch.FloatTensor of size 2]
Variable containing:
24
24
[torch.FloatTensor of size 2]
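The tensor handed to backward() is not just a placeholder: it weights each output element (it is the vector in the vector-Jacobian product). A sketch with a non-uniform weight, reusing the same z = x^2 + 3y^2 (the weight values are my own choice):
import torch
from torch.autograd import Variable
x = Variable(torch.ones(2)*3, requires_grad=True)
y = Variable(torch.ones(2)*4, requires_grad=True)
z = x.pow(2)+3*y.pow(2)
# Weight the first output element by 1 and the second by 0.5.
z.backward(torch.Tensor([1, 0.5]))
print(x.grad)   # [6, 3]   = weight * 2x
print(y.grad)   # [24, 12] = weight * 6y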
Type 3: vectors with distinct values
x = Variable(torch.Tensor([1,2,3]), requires_grad=True)
y = Variable(torch.Tensor([4,5,6]), requires_grad=True)
z = x.pow(2)+3*y.pow(2)
z.backward(torch.ones(3))
print x.grad
print y.grad
Result:
Variable containing:
2
4
6
[torch.FloatTensor of size 3]
Variable containing:
24
30
36
[torch.FloatTensor of size 3]
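One caveat when reusing the same Variables: .grad accumulates across backward() calls rather than being overwritten, so it is usually cleared between passes. A short sketch of my own (here .grad is itself a Variable, as the outputs above show, so its .data is zeroed in place):
import torch
from torch.autograd import Variable
x = Variable(torch.Tensor([1,2,3]), requires_grad=True)
z = x.pow(2)
z.backward(torch.ones(3))
print(x.grad)          # [2, 4, 6]
z = x.pow(2)           # rebuild the graph for a second backward pass
z.backward(torch.ones(3))
print(x.grad)          # [4, 8, 12] -- accumulated, not replaced
x.grad.data.zero_()    # clear the buffer before the next pass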
Type 4: matrix multiplication
x = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
y = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
z = x.mm(y.t())
z.backward(torch.ones(2,2))
print x.grad
print y.grad
Result:
Variable containing:
5 7 9
5 7 9
[torch.FloatTensor of size 2x3]
Variable containing:
5 7 9
5 7 9
[torch.FloatTensor of size 2x3]
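As a quick check of the numbers above: for z = x.mm(y.t()) with an all-ones upstream gradient g, the analytic gradients are dz/dx = g.mm(y) and dz/dy = g.t().mm(x). Recomputing them directly from the data tensors (g is my own name; x and y are the Variables defined above):
g = torch.ones(2, 2)
print(g.mm(y.data))      # matches x.grad: [[5, 7, 9], [5, 7, 9]]
print(g.t().mm(x.data))  # matches y.grad: [[5, 7, 9], [5, 7, 9]]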
Type 5: matrix-vector multiplication
x = Variable(torch.Tensor([[1,2,3],[4,5,6]]), requires_grad=True)
y = Variable(torch.Tensor([1,3,5]), requires_grad=True) # a single pair of brackets gives a 1-D tensor; with double brackets it would be a matrix and mv could not be used
z = x.mv(y)
z.backward(torch.ones(2))
print x.grad
print y.grad
Result:
Variable containing:
1 3 5
1 3 5
[torch.FloatTensor of size 2x3]
Variable containing:
5
7
9
[torch.FloatTensor of size 3]
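The same kind of check for the matrix-vector case: with an all-ones upstream gradient g of size 2, dz/dx is the outer product of g and y, and dz/dy is x.t() applied to g (again, g is my own name; x and y are the Variables just defined):
g = torch.ones(2)
print(g.ger(y.data))     # outer product, matches x.grad: [[1, 3, 5], [1, 3, 5]]
print(x.data.t().mv(g))  # matches y.grad: [5, 7, 9]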
Type 6: CUDA. The first run seems to trigger compilation and takes quite a while, roughly 3 to 5 minutes; subsequent runs are fast.
import torch
from torch.autograd import Variable
print ' -------- cuda scalar --------'
x = Variable((torch.ones(1)*3).cuda(), requires_grad=True)
y = Variable((torch.ones(1)*4).cuda(), requires_grad=True)
z = (x.pow(2)+3*y.pow(2))
print 'z without cuda: {0}'.format(type(z))
z.backward()
print '\nx.type: {0}, x.grad: {1}'.format(type(x),x.grad)
print 'y.type: {0}, y.grad: {1}'.format(type(y),y.grad)
print 'z.type: {0}, z: {1}'.format(type(z),z)
print ' -------- cuda matrix --------'
x = Variable(torch.Tensor([[1, 2, 3], [4, 5, 6]]).cuda(), requires_grad=True)
y = Variable(torch.Tensor([[1, 2, 3], [4, 5, 6]]).cuda(), requires_grad=True)
z = x.mm(y.t())#.cuda(0,async=True)
z.backward(torch.ones(2, 2).cuda(0,async=True))
print '\nx.type: {0}, x.grad: {1}'.format(type(x),x.grad)
print 'y.type: {0}, y.grad: {1}'.format(type(y),y.grad)
print 'z.type: {0}, z: {1}'.format(type(z),z)
Result:
-------- cuda scalar --------
z without cuda: <class 'torch.autograd.variable.Variable'>
x.type: <class 'torch.autograd.variable.Variable'>, x.grad: Variable containing:
6
[torch.cuda.FloatTensor of size 1 (GPU 0)]
y.type: <class 'torch.autograd.variable.Variable'>, y.grad: Variable containing:
24
[torch.cuda.FloatTensor of size 1 (GPU 0)]
z.type: <class 'torch.autograd.variable.Variable'>, z: Variable containing:
57
[torch.cuda.FloatTensor of size 1 (GPU 0)]
-------- cuda matrix --------
x.type: <class 'torch.autograd.variable.Variable'>, x.grad: Variable containing:
5 7 9
5 7 9
[torch.cuda.FloatTensor of size 2x3 (GPU 0)]
y.type: <class 'torch.autograd.variable.Variable'>, y.grad: Variable containing:
5 7 9
5 7 9
[torch.cuda.FloatTensor of size 2x3 (GPU 0)]
z.type: <class 'torch.autograd.variable.Variable'>, z: Variable containing:
14 32
32 77
[torch.cuda.FloatTensor of size 2x2 (GPU 0)]
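Since not every machine has a GPU, the CUDA examples above can be guarded with torch.cuda.is_available(); the CPU fallback below is my own addition, not from the original examples:
import torch
from torch.autograd import Variable
use_cuda = torch.cuda.is_available()
t = torch.Tensor([[1, 2, 3], [4, 5, 6]])
if use_cuda:
    t = t.cuda()       # move the data to the default GPU before wrapping it
x = Variable(t, requires_grad=True)
z = x.pow(2).sum()     # scalar output, so backward() needs no argument
z.backward()
print(x.grad)          # 2x, on whichever device the data lives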