Reference link: torch.Tensor.grad
grad
Attribute: grad
This attribute is None by default and becomes a Tensor the first time a call to
backward() computes gradients for self. The attribute will then contain the
gradients computed and future calls to backward() will accumulate (add)
gradients into it.
In other words, this attribute is None by default; the first time backward() computes gradients
for this tensor (self), grad becomes a Tensor holding the computed gradients. From then on,
every further call to backward() accumulates (adds) new gradients into grad rather than
overwriting it.
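This accumulation is exactly why training loops clear gradients before each backward pass. A minimal sketch of the pattern (the tensor w and the loss below are hypothetical, purely for illustration):

import torch

# Hypothetical toy example: w stands in for a model parameter.
w = torch.randn(3, requires_grad=True)

for step in range(3):
    if w.grad is not None:
        w.grad.zero_()        # drop this line and the printed gradients grow each step
    loss = (2 * w).sum()      # d(loss)/dw = 2 for every element
    loss.backward()
    print(step, w.grad)       # tensor([2., 2., 2.]) every step, because we zeroed

In real training code the same clearing is usually done with optimizer.zero_grad().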
Code demonstration:
>>> import torch
>>> torch.manual_seed(seed=20200910)
<torch._C.Generator object at 0x000002B58817D330>
>>>
>>> a = torch.randn(3,5,requires_grad=True)
>>> a
tensor([[ 0.2824, -0.3715,  0.9088, -1.7601, -0.1806],
        [ 2.0937,  1.0406, -1.7651,  1.1216,  0.8440],
        [ 0.1783,  0.6859, -1.5942, -0.2006, -0.4050]], requires_grad=True)
>>> a.grad
>>>
>>> b = a.sum()
>>> b
tensor(0.8781, grad_fn=<SumBackward0>)
>>> c = a.mean()
>>>
>>> b
tensor(0.8781, grad_fn=<SumBackward0>)
>>> c
tensor(0.0585, grad_fn=<MeanBackward0>)
>>> print(a.grad)
None
>>>
>>> b.backward()
>>> a.grad
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])
>>>
>>> # without zeroing, gradients accumulate across backward() calls
>>> c.backward()
>>> a.grad
tensor([[1.0667, 1.0667, 1.0667, 1.0667, 1.0667],
        [1.0667, 1.0667, 1.0667, 1.0667, 1.0667],
        [1.0667, 1.0667, 1.0667, 1.0667, 1.0667]])
>>>
>>> a.grad.zero_()
tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
>>> a.grad
tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
>>>
>>> c = a.mean()
>>> c.backward()
>>> a.grad
tensor([[0.0667, 0.0667, 0.0667, 0.0667, 0.0667],
        [0.0667, 0.0667, 0.0667, 0.0667, 0.0667],
        [0.0667, 0.0667, 0.0667, 0.0667, 0.0667]])
>>>
>>>
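The numbers above check out: for b = a.sum() the gradient of every element is 1, and for c = a.mean() over the 15 elements it is 1/15 ≈ 0.0667, so the accumulated value after both backward() calls is 1 + 1/15 ≈ 1.0667:

>>> 1/15
0.06666666666666667
>>> 1 + 1/15
1.0666666666666667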
Reference link: Using torch.sort() in PyTorch to pick out the k largest or k smallest elements
Code demonstration: backpropagation that reaches only the top-k largest elements. The trick used below: the first sort returns idx1, the original index of each element in descending order; sorting idx1 in turn returns idx2, which gives each original element's rank in that descending order, so the mask idx2 < k selects exactly the k largest elements:
>>> import torch
>>> torch.manual_seed(seed=20200910)
<torch._C.Generator object at 0x000001F7F101D330>
>>>
>>> data = torch.randn(15,requires_grad=True)
>>> data
tensor([ 0.2824, -0.3715,  0.9088, -1.7601, -0.1806,  2.0937,  1.0406, -1.7651,
         1.1216,  0.8440,  0.1783,  0.6859, -1.5942, -0.2006, -0.4050],
       requires_grad=True)
>>>
>>> # pick out the k largest values
>>> k = 7
>>> a, idx1 = torch.sort(data, descending=True)
>>> b, idx2 = torch.sort(idx1)
>>> a
tensor([ 2.0937,  1.1216,  1.0406,  0.9088,  0.8440,  0.6859,  0.2824,  0.1783,
        -0.1806, -0.2006, -0.3715, -0.4050, -1.5942, -1.7601, -1.7651],
       grad_fn=<SortBackward>)
>>> b
tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14])
>>> idx1
tensor([ 5,  8,  6,  2,  9, 11,  0, 10,  4, 13,  1, 14, 12,  3,  7])
>>> idx2
tensor([ 6, 10,  3, 13,  8,  0,  2, 14,  1,  4,  7,  5, 12,  9, 11])
>>>
>>> a
tensor([ 2.0937,  1.1216,  1.0406,  0.9088,  0.8440,  0.6859,  0.2824,  0.1783,
        -0.1806, -0.2006, -0.3715, -0.4050, -1.5942, -1.7601, -1.7651],
       grad_fn=<SortBackward>)
>>> data
tensor([ 0.2824, -0.3715,  0.9088, -1.7601, -0.1806,  2.0937,  1.0406, -1.7651,
         1.1216,  0.8440,  0.1783,  0.6859, -1.5942, -0.2006, -0.4050],
       requires_grad=True)
>>> data[idx2<k]
tensor([0.2824, 0.9088, 2.0937, 1.0406, 1.1216, 0.8440, 0.6859],
       grad_fn=<IndexBackward>)
>>> sum_topK = data[idx2<k].sum()
>>> sum_topK
tensor(6.9770, grad_fn=<SumBackward0>)
>>>
>>> 2.0937+1.1216+1.0406+0.9088+0.8440+0.6859+0.2824
6.977000000000001
>>> data.grad
>>> print(data.grad)
None
>>> sum_topK.backward()
>>> print(data.grad)
tensor([1., 0., 1., 0., 0., 1., 1., 0., 1., 1., 0., 1., 0., 0., 0.])
>>>
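For comparison, the same masked backward pass can be written more directly with torch.topk(), which returns the top-k values together with their original indices. A minimal sketch (not from the original post), reusing the same seed so data matches the transcript above:

import torch

torch.manual_seed(20200910)
data = torch.randn(15, requires_grad=True)

values, indices = torch.topk(data, k=7)  # 7 largest values and their positions in data
values.sum().backward()                  # gradient flows only to those 7 elements
print(data.grad)  # 1.0 at the top-7 positions, 0.0 elsewhere, same mask as above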