PyTorch: broadcasting, merging & splitting, math operations, statistics, and advanced operations!
Note: Broadcasting expands dimensions in the same way expand does, but it happens automatically and without copying data, so it saves memory. Key ideas:
Why broadcasting exists: ① it performs the logical expansion; ② it saves memory. When a dimension is missing, a size-1 dimension is first inserted at the front, and then every size-1 dimension is expanded to match the other tensor.
import torch
a = torch.rand(4, 32, 14, 14)
b = torch.rand(1, 32, 1, 1)
c = torch.rand(32, 1, 1)
# b [1, 32, 1, 1]=>[4, 32, 14, 14]
print((a + b).shape)
print((a + c).shape)  # c [32, 1, 1] => [1, 32, 1, 1] => [4, 32, 14, 14]
torch.Size([4, 32, 14, 14])
torch.Size([4, 32, 14, 14])
Process finished with exit code 0
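To make the alignment rule concrete, here is a minimal sketch (the tensors are purely illustrative) showing that broadcasting is equivalent to manually unsqueezing and expanding:
import torch
a = torch.rand(4, 32, 14, 14)
c = torch.rand(32, 1, 1)
# manual equivalent of broadcasting: [32, 1, 1] -> [1, 32, 1, 1] -> [4, 32, 14, 14]
c_manual = c.unsqueeze(0).expand(4, 32, 14, 14)
print(torch.all(torch.eq(a + c, a + c_manual)))  # tensor(True)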
import torch
# a holds 4 classes and b holds 5 classes; each class has 32 students with 8 subject scores.
a = torch.rand(4, 32, 8)
b = torch.rand(5, 32, 8)
# Concatenate along the class dimension (dim=0).
print(torch.cat([a, b], dim=0).shape)
torch.Size([9, 32, 8])
Process finished with exit code 0
import torch
a1 = torch.rand(4, 3, 32, 32)
a2 = torch.rand(5, 3, 32, 32)
print(torch.cat([a1, a2], dim=0).shape)
print('====================================')
a3 = torch.rand(4, 1, 32, 32)
# print(torch.cat([a1, a3], dim=0))  # this line raises an error: apart from the cat dim, all other dims must match (3 vs 1 in dim=1)
print(torch.cat([a1, a3], dim=1).shape)
torch.Size([9, 3, 32, 32])
====================================
torch.Size([4, 4, 32, 32])
Process finished with exit code 0
import torch
a1 = torch.rand(4, 3, 32, 32)
a2 = torch.rand(4, 3, 32, 32)
print(torch.cat([a1, a2], dim=1).shape)
print('====================================')
print(torch.stack([a1, a2], dim=1).shape)  # stack inserts a new dimension for each tensor, then concatenates along it
a = torch.rand(32, 8)
b = torch.rand(32, 8)
print(torch.stack([a, b], dim=0).shape)
torch.Size([4, 6, 32, 32])
====================================
torch.Size([4, 2, 3, 32, 32])
torch.Size([2, 32, 8])
Process finished with exit code 0
Note: A concrete use case: classes a and b each have 60 students with 8 subject scores, i.e. each is a [60, 8] tensor, and we want to combine the two classes into one table. cat would give [120, 8], which erases the class boundary and is clearly not what we want; stack gives [2, 60, 8], which is exactly right (see the sketch below). This also shows that stack requires the two tensors to have exactly the same shape.
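A minimal sketch of that class example (the shapes are the only assumption):
import torch
class_a = torch.rand(60, 8)  # 60 students, 8 subject scores
class_b = torch.rand(60, 8)
scores = torch.stack([class_a, class_b], dim=0)  # new leading "class" dimension
print(scores.shape)  # torch.Size([2, 60, 8])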
Note: .split(split_size, dim): the first argument is the length of each chunk after splitting (a single length, or a list of lengths), and the second is the dimension to split along (an uneven split is sketched after the example below).
import torch
c = torch.rand(2, 32, 8)
aa, bb = c.split([1, 1], dim=0)  # explicit lengths: two chunks of length 1
print(aa.shape)
print(bb.shape)
print('====================================')
aa, bb = c.split(1, dim=0)  # single length: every chunk has length 1
print(aa.shape)
print(bb.shape)
torch.Size([1, 32, 8])
torch.Size([1, 32, 8])
====================================
torch.Size([1, 32, 8])
torch.Size([1, 32, 8])
Process finished with exit code 0
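split also accepts unequal chunk lengths; a quick sketch (the sizes here are illustrative):
import torch
c = torch.rand(8, 32, 8)
x, y, z = c.split([2, 3, 3], dim=0)  # lengths must sum to the size of dim 0
print(x.shape, y.shape, z.shape)
# torch.Size([2, 32, 8]) torch.Size([3, 32, 8]) torch.Size([3, 32, 8])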
Note: .chunk(chunks, dim): the first argument is the number of chunks to split into, and the second is the dimension to split along.
import torch
c = torch.rand(8, 32, 8)
aa, bb = c.chunk(2, dim=0)  # first argument is the number of chunks
print(aa.shape)
print(bb.shape)
torch.Size([4, 32, 8])
torch.Size([4, 32, 8])
Process finished with exit code 0
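When the dimension is not evenly divisible, chunk simply makes the last chunk smaller; a quick sketch (sizes are illustrative):
import torch
c = torch.rand(7, 32, 8)
aa, bb = c.chunk(2, dim=0)
print(aa.shape, bb.shape)  # torch.Size([4, 32, 8]) torch.Size([3, 32, 8])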
import torch
a = torch.rand(3, 4)
b = torch.rand(4)
print(a+b)
print(torch.add(a, b))
print(torch.all(torch.eq(a-b, torch.sub(a, b))))
print(torch.all(torch.eq(a*b, torch.mul(a, b))))
print(torch.all(torch.eq(a/b, torch.div(a, b))))
tensor([[0.5039, 1.2329, 1.4820, 0.7634],
[1.1962, 1.2740, 1.1871, 0.6491],
[1.0346, 1.0578, 0.9915, 0.7993]])
tensor([[0.5039, 1.2329, 1.4820, 0.7634],
[1.1962, 1.2740, 1.1871, 0.6491],
[1.0346, 1.0578, 0.9915, 0.7993]])
tensor(True)
tensor(True)
tensor(True)
Process finished with exit code 0
import torch
a = torch.tensor([[3., 3.], [3., 3.]])
b = torch.ones(2, 2)
print(torch.mm(a, b))
print(torch.matmul(a, b))
print(a@b)
tensor([[6., 6.],
[6., 6.]])
tensor([[6., 6.],
[6., 6.]])
tensor([[6., 6.],
[6., 6.]])
Process finished with exit code 0
Note: The multiplications above are for 2D tensors, so how do we matrix-multiply 3D and 4D tensors? A single image is 2D, but image batches in neural networks are typically 4D, and text batches in NLP are often 3D or 4D. How is matrix multiplication defined for them? The example below shows it.
import torch
a = torch.rand(4, 3, 28, 64)
b = torch.rand(4, 3, 64, 32)
# 4D tensor matrix multiplication: matmul batches over the leading dims,
# i.e. many matrix pairs are multiplied in parallel.
# Only the last two dims take part in the matrix product: [28, 64] @ [64, 32]
print(torch.matmul(a, b).shape)
print('================================')
c = torch.rand(4, 1, 64, 32)  # broadcasting expands the size-1 dim to match before the batched matmul
print(torch.matmul(a, c).shape)
torch.Size([4, 3, 28, 32])
================================
torch.Size([4, 3, 28, 32])
Process finished with exit code 0
Note: pow(tensor, exponent): the first argument is the tensor, the second is the exponent, e.g. squaring, cubing, and so on.
import torch
a = torch.full([2, 2], 3.)  # torch.full creates a [2, 2] tensor filled with 3. (float, matching the output below)
print(a.pow(2))
print(torch.pow(a, 2))
print(a**2)
print('=============================')
b = a**2
print(b.sqrt())
print(b.rsqrt())  # reciprocal of the square root, i.e. 1/sqrt(b)
tensor([[9., 9.],
[9., 9.]])
tensor([[9., 9.],
[9., 9.]])
tensor([[9., 9.],
[9., 9.]])
=============================
tensor([[3., 3.],
[3., 3.]])
tensor([[0.3333, 0.3333],
[0.3333, 0.3333]])
Process finished with exit code 0
import torch
a = torch.exp(torch.ones(2, 2))
print(a)
print(torch.log(a))  # natural log (base e) by default; see below for other bases
tensor([[2.7183, 2.7183],
[2.7183, 2.7183]])
tensor([[1., 1.],
[1., 1.]])
Process finished with exit code 0
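For other bases, PyTorch provides torch.log2 and torch.log10, and any base can be obtained with the change-of-base rule; a quick sketch:
import torch
a = torch.full([2, 2], 8.)
print(torch.log2(a))                               # base-2 log -> all 3.
print(torch.log10(torch.full([2, 2], 100.)))       # base-10 log -> all 2.
print(torch.log(a) / torch.log(torch.tensor(2.)))  # base 2 via change of base, same as torch.log2(a)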
import torch
a = torch.tensor(3.14)
# .floor() rounds down, .ceil() rounds up, .trunc() keeps the integer part, .frac() keeps the fractional part.
print(a.floor(), a.ceil(), a.trunc(), a.frac())
print(a.round())
b = torch.tensor(3.5)
print(b.round())
tensor(3.) tensor(4.) tensor(3.) tensor(0.1400)
tensor(3.)
tensor(4.)
Process finished with exit code 0
Note: clamp is mainly used for gradient clipping, which concerns vanishing gradients (gradients very small, close to 0, which clipping does not need to address) and exploding gradients (very large gradients; 100 already counts as large). When training is unstable, it helps to print the norm of the gradients:
w.grad.norm(2)
gives the L2 norm of the gradient (100 or 1000 is considered large; values within about 10 are usually reasonable). A training-loop clipping sketch follows the example below.
a.clamp(min):
sets every element of a that is smaller than min to min, i.e. min becomes the lower bound (e.g. a.clamp(10) raises everything below 10 up to 10).
import torch
grad = torch.rand(2, 3)*15
print(grad)
print(grad.max(), grad.median(), grad.min())
print('============================================')
print(grad.clamp(10))  # lower bound 10: every element below 10 becomes 10
print(grad.clamp(8, 15))
print(torch.clamp(grad, 8, 15))
tensor([[11.0328, 4.9081, 2.3248],
[11.3747, 3.9017, 11.5049]])
tensor(11.5049) tensor(4.9081) tensor(2.3248)
============================================
tensor([[11.0328, 10.0000, 10.0000],
[11.3747, 10.0000, 11.5049]])
tensor([[11.0328, 8.0000, 8.0000],
[11.3747, 8.0000, 11.5049]])
tensor([[11.0328, 8.0000, 8.0000],
[11.3747, 8.0000, 11.5049]])
Process finished with exit code 0
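For gradient clipping inside an actual training step, PyTorch's torch.nn.utils.clip_grad_norm_ rescales gradients so their total norm stays below a threshold; a minimal sketch (the linear model and random data are placeholders):
import torch
import torch.nn as nn
model = nn.Linear(8, 1)                      # placeholder model
x, y = torch.rand(16, 8), torch.rand(16, 1)  # placeholder data
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
print(model.weight.grad.norm(2))                                  # inspect the gradient's L2 norm
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10)   # clip so the total norm is at most 10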
Note: For the definitions of vector and matrix norms, see my earlier post: 2.2. Vector norms and matrix norms.
import torch
a = torch.full([8], 1.)  # float fill, matching the output below
b = a.view(2, 4)
c = a.view(2, 2, 2)
print(a, '\n', b,'\n', c)
print('=============================================')
print(a.norm(1), b.norm(1), c.norm(1))
print(a.norm(2), b.norm(2), c.norm(2))
print('=============================================')
print(b.norm(1, dim=1))
print(b.norm(2, dim=1))
print('=============================================')
print(c.norm(1, dim=0))
print(c.norm(2, dim=0))
print(torch.norm(c, p=2, dim=0))  # equivalent call; p=2 can be omitted because 2 is the default
tensor([1., 1., 1., 1., 1., 1., 1., 1.])
tensor([[1., 1., 1., 1.],
[1., 1., 1., 1.]])
tensor([[[1., 1.],
[1., 1.]],
[[1., 1.],
[1., 1.]]])
=============================================
tensor(8.) tensor(8.) tensor(8.)
tensor(2.8284) tensor(2.8284) tensor(2.8284)
=============================================
tensor([4., 4.])
tensor([2., 2.])
=============================================
tensor([[2., 2.],
[2., 2.]])
tensor([[1.4142, 1.4142],
[1.4142, 1.4142]])
tensor([[1.4142, 1.4142],
[1.4142, 1.4142]])
Process finished with exit code 0
import torch
a = torch.rand(2, 4)
print(a)
print(a.max(), a.min(), a.mean())  # max, min, mean
print(a.prod())  # product of all elements
print(a.sum())   # sum of all elements
print(a.argmax(), a.argmin())  # flat indices of the max and min
tensor([[0.4677, 0.8331, 0.4240, 0.9348],
[0.0192, 0.2354, 0.9979, 0.0077]])
tensor(0.9979) tensor(0.0077) tensor(0.4900)
tensor(5.3340e-06)
tensor(3.9197)
tensor(6) tensor(7)
Process finished with exit code 0
Note: As the results show, when called without a dim argument, min/max/argmin/argmax first flatten the tensor into a 1D tensor, which is why argmax/argmin above return a single flat index.
import torch
a = torch.rand(4, 5)
print(a)
print(a.max(dim=1))  # returns values and indices, each of shape [4]
print('==============================')
# keepdim=True keeps the reduced dimension as size 1.
print(a.max(dim=1, keepdim=True))  # values and indices now have shape [4, 1], [4, 1]
tensor([[0.0956, 0.1968, 0.2054, 0.3631, 0.5661],
[0.8228, 0.9709, 0.1276, 0.2207, 0.5825],
[0.7764, 0.2675, 0.1439, 0.3109, 0.6960],
[0.7047, 0.5668, 0.3775, 0.6214, 0.0674]])
torch.return_types.max(
values=tensor([0.5661, 0.9709, 0.7764, 0.7047]),
indices=tensor([4, 1, 0, 0]))
==============================
torch.return_types.max(
values=tensor([[0.5661],
[0.9709],
[0.7764],
[0.7047]]),
indices=tensor([[4],
[1],
[0],
[0]]))
Process finished with exit code 0
Note: topk(3, dim=1) returns the 3 largest elements (and their indices) along dim=1, as shown in the output below; setting largest=False returns the smallest ones instead.
Note: kthvalue(k, dim=1) returns the k-th smallest value (it always counts from the smallest). With 10 values per row below, the 8th smallest would be the same as the 3rd largest.
import torch
a = torch.rand(5, 10)
print(a.topk(3, dim=1))  # the 3 largest elements per row and their indices
print('==========================================================')
print(a.topk(3, dim=1, largest=False))  # the 3 smallest elements per row and their indices
print('==========================================================')
print(a.kthvalue(3))         # 3rd smallest along the last dim (the default)
print(a.kthvalue(3, dim=1))  # same result with dim given explicitly
torch.return_types.topk(
values=tensor([[0.9644, 0.8750, 0.8059],
[0.9445, 0.9039, 0.8314],
[0.9025, 0.8567, 0.8550],
[0.9710, 0.8377, 0.8066],
[0.8984, 0.8439, 0.8386]]),
indices=tensor([[5, 7, 2],
[3, 6, 4],
[1, 5, 0],
[3, 9, 6],
[3, 8, 2]]))
==========================================================
torch.return_types.topk(
values=tensor([[0.0790, 0.2262, 0.3413],
[0.1071, 0.1207, 0.1217],
[0.2904, 0.3274, 0.3424],
[0.1910, 0.2919, 0.5602],
[0.2474, 0.2730, 0.6032]]),
indices=tensor([[6, 8, 3],
[2, 5, 0],
[2, 4, 3],
[4, 2, 7],
[4, 0, 7]]))
==========================================================
torch.return_types.kthvalue(
values=tensor([0.3413, 0.1217, 0.3424, 0.5602, 0.6032]),
indices=tensor([3, 0, 3, 7, 7]))
torch.return_types.kthvalue(
values=tensor([0.3413, 0.1217, 0.3424, 0.5602, 0.6032]),
indices=tensor([3, 0, 3, 7, 7]))
Process finished with exit code 0
Note: torch.gt (greater than) tests element-wise strict greater-than, the same as a > 0.2 below; torch.ge tests greater-than-or-equal. torch.eq tests element-wise equality, while torch.equal compares two whole tensors and returns a single bool (a sketch follows the output below).
import torch
a = torch.rand(5, 5)
print(a>0.2)
print(torch.gt(a, 0.2))
print(a!=0)
tensor([[False, True, True, True, True],
[ True, True, True, True, True],
[ True, True, True, True, False],
[ True, True, False, True, True],
[ True, True, True, True, True]])
tensor([[False, True, True, True, True],
[ True, True, True, True, True],
[ True, True, True, True, False],
[ True, True, False, True, True],
[ True, True, True, True, True]])
tensor([[True, True, True, True, True],
[True, True, True, True, True],
[True, True, True, True, True],
[True, True, True, True, True],
[True, True, True, True, True]])
Process finished with exit code 0
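To make the eq vs. equal distinction concrete, a minimal sketch:
import torch
a = torch.ones(2, 2)
b = torch.ones(2, 2)
print(torch.eq(a, b))     # element-wise: tensor([[True, True], [True, True]])
print(torch.equal(a, b))  # whole-tensor comparison: True (a plain Python bool)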