from https://www.jianshu.com/p/96a687ecbac4
grad
This attribute defaults to None. After the first call to backward() it holds a gradient value, and every subsequent call to backward() accumulates new gradients into it. This is why, when training a network, we need to zero the gradients (zero_grad) before computing backward() in each iteration.
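A minimal sketch of the accumulation behaviour (the tensor name x is purely illustrative; the values in the comments follow from d(x*x)/dx = 2x):

import torch

x = torch.tensor([3.0], requires_grad=True)

(x * x).sum().backward()
print(x.grad)    # tensor([6.])

(x * x).sum().backward()
print(x.grad)    # tensor([12.]) -- the new gradient was added to the old one

x.grad.zero_()   # clear the accumulated gradient in place (cf. optimizer.zero_grad())
(x * x).sum().backward()
print(x.grad)    # tensor([6.]) again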
requires_grad
If gradients need to be computed for the current tensor, this attribute must be True; otherwise it is False. Note, however, that requiring a gradient and actually having the grad attribute populated are not the same thing; see the description of is_leaf below. A tensor can have requires_grad=True and yet, because it is not a leaf node, end up with no value in its grad attribute.
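A short sketch of this distinction (the names a and b are illustrative; recent PyTorch versions additionally print a warning when .grad of a non-leaf tensor is read):

import torch

a = torch.randn(2, requires_grad=True)  # created by the user -> leaf
b = a * 3                               # result of an operation -> non-leaf
print(b.requires_grad)                  # True: requires_grad propagates to b
b.sum().backward()
print(a.grad)                           # tensor([3., 3.]): a is a leaf, grad populated
print(b.grad)                           # None: b requires grad, but is not a leaf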
backward()
Computes the gradient of the current tensor with respect to the leaf nodes of the graph. As mentioned above, these gradients are accumulated into the grad attributes of the corresponding leaf nodes.
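A tiny sketch (names chosen for illustration): the gradients of a scalar result land on the leaf tensors it was built from.

import torch

w = torch.tensor([2.0], requires_grad=True)
b = torch.tensor([1.0], requires_grad=True)
loss = (w * 3 + b).sum()
loss.backward()
print(w.grad, b.grad)   # tensor([3.]) tensor([1.])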
register_hook()
The hook is registered before backward() is called; while backward() runs, it lets you modify the gradient flowing through the tensor, for example multiplying every computed gradient by 2.
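A minimal sketch of such a hook; the lambda here is an illustrative stand-in for the double_grad function used in the full example below:

import torch

x = torch.tensor([1.0], requires_grad=True)
y = x * 2
y.register_hook(lambda grad: grad * 2)  # runs when the gradient w.r.t. y is computed
y.backward()
print(x.grad)   # tensor([4.]) instead of the usual tensor([2.])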
detach()/detach_()
detach() takes a value out of the computation graph and returns a new tensor; the new tensor shares the original data but has no gradient history and requires_grad=False.
detach_() detaches the tensor from the computation graph in place, turning it into a leaf node.
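A short sketch of both variants (the names x, y, z, w are illustrative; the values in the comments follow from y = x * x with x = 2):

import torch

x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y.detach()                      # new tensor, cut off from the graph
print(z.requires_grad, z.is_leaf)   # False True
y.backward()                        # gradients still flow through y itself
print(x.grad)                       # tensor([4.])

w = x * 3
w.detach_()                         # detach in place: w becomes a leaf node
print(w.requires_grad, w.is_leaf)   # False True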
is_leaf
Properties of leaf nodes:
When tensor.backward() is called, only the grad attributes of leaf nodes are populated. If you want the grad attribute of a non-leaf node to be populated, call retain_grad() on it before calling backward().
Which nodes count as leaf nodes:
All tensors whose requires_grad attribute is False are leaf nodes by convention. A tensor whose requires_grad attribute is True is a leaf node if and only if it was created directly by the user, that is, it is not the result of any operation on other tensors. Intermediate nodes are therefore usually not leaf nodes, such as output in the program below.
All Tensors that have requires_grad which is False will be leaf Tensors by convention.
For Tensors that have requires_grad which is True, they will be leaf Tensors if they were created by the user. This means that they are not the result of an operation and so grad_fn is None.
Only leaf Tensors will have their grad populated during a call to backward(). To get grad populated for non-leaf Tensors, you can use retain_grad().
Example:
import torch

seed = 42
torch.manual_seed(seed)


def test_not_leaf_get_grad(t):
    # retain_grad(): enables the .grad attribute for non-leaf Tensors.
    t.retain_grad()


def print_isleaf(t):
    print('*' * 5)
    print('{} is leaf {}!'.format(t, t.is_leaf))


def print_grad(t):
    print('{} grad is {}'.format(t, t.grad))


def double_grad(grad):
    grad = grad * 2
    return grad


if __name__ == "__main__":
    # init
    input_ = torch.randn(1, requires_grad=True)
    print_isleaf(input_)  # tensor([0.3367], requires_grad=True) is leaf True!
    print_isleaf(torch.randn(1, requires_grad=False))  # tensor([0.1288]) is leaf True!

    output = input_ * input_
    print_isleaf(output)  # tensor([0.1134], grad_fn=<MulBackward0>) is leaf False!
    output2 = output * 2

    # test retain_grad()
    # make the non-leaf node keep its gradient
    test_not_leaf_get_grad(output)

    # test register_hook()
    # The hook will be called every time a gradient with respect to the Tensor is computed.
    output2.register_hook(double_grad)

    output2.backward()
    print(output2.requires_grad)  # True
    print_isleaf(output2)  # tensor([0.2267], grad_fn=<MulBackward0>) is leaf False!
    print_grad(output2)  # tensor([0.2267], grad_fn=<MulBackward0>) grad is None

    # test leaf
    print_isleaf(input_)  # tensor([0.3367], requires_grad=True) is leaf True!
    print_grad(input_)  # tensor([0.3367], requires_grad=True) grad is tensor([2.6935])

    # test non-leaf
    print_isleaf(output)  # tensor([0.1134], grad_fn=<MulBackward0>) is leaf False!
    print(output.requires_grad)  # True
    print_grad(output)  # tensor([0.1134], grad_fn=<MulBackward0>) grad is tensor([4.])
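For comparison, the same program without the retain_grad() call is kept below (commented out); there the non-leaf output ends up with grad None.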
# if __name__ == "__main__":
#     # init
#     input_ = torch.randn(1, requires_grad=True)
#     output = input_ * input_
#     output2 = output * 2
#
#     # test register_hook()
#     # The hook will be called every time a gradient with respect to the Tensor is computed.
#     output2.register_hook(double_grad)
#
#     output2.backward()
#
#     # test leaf
#     print_isleaf(input_)  # tensor([0.3367], requires_grad=True) is leaf True!
#
#     print_grad(input_)  # tensor([0.3367], requires_grad=True) grad is tensor([2.6935])
#
#     # test non-leaf
#     print_isleaf(output)  # tensor([0.1134], grad_fn=<MulBackward0>) is leaf False!
#     print(output.requires_grad)  # True
#     print_grad(output)  # tensor([0.1134], grad_fn=<MulBackward0>) grad is None
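A third commented-out variant follows, in which input_ has requires_grad=False and the weight w1 is the only leaf that requires gradients; only w1.grad gets populated.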
# if __name__ == "__main__":
#     # init
#     input_ = torch.randn(1, requires_grad=False)
#     w1 = torch.tensor([10.], requires_grad=True)
#     output = input_ * w1
#     output2 = output * 2
#
#     # test register_hook()
#     # The hook will be called every time a gradient with respect to the Tensor is computed.
#     output2.register_hook(double_grad)
#
#     output2.backward()
#
#     # test leaf
#     print_isleaf(input_)  # tensor([0.3367]) is leaf True!
#
#     print_grad(input_)  # tensor([0.3367]) grad is None
#
#     # test non-leaf
#     print_isleaf(output)  # tensor([3.3669], grad_fn=<MulBackward0>) is leaf False!
#     print(output.requires_grad)  # True
#     print_grad(output)  # tensor([3.3669], grad_fn=<MulBackward0>) grad is None
#     print_isleaf(w1)  # tensor([10.], requires_grad=True) is leaf True!
#     print_grad(w1)  # tensor([10.], requires_grad=True) grad is tensor([1.3468])