PyTorch Introductory Learning (5) ------- autograd

torch.autograd.backward ------ automatically compute gradients

torch.autograd.backward(tensors, grad_tensors=None, retain_graph=None, create_graph=False)

tensors: the tensors to differentiate, e.g. the loss (a minimal call is sketched below);
retain_graph: keep the computation graph after the backward pass;
create_graph: build a graph of the derivative computation, used for higher-order derivatives;
grad_tensors: weights for multiple gradients; sets the proportions when several losses are combined;
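As a quick orientation (this snippet is not part of the example script further below), Tensor.backward() is a thin wrapper that forwards to torch.autograd.backward(), so the two calls are interchangeable for a scalar output; a minimal sketch:

import torch

w = torch.tensor([1.], requires_grad=True)
x = torch.tensor([2.], requires_grad=True)
y = (w + x) * (w + 1)

# functional form; equivalent to y.backward()
torch.autograd.backward(y)
print(w.grad)    # dy/dw = (w + 1) + (x + w) = 5  ->  tensor([5.])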

retain_graph
When backward() has to be called more than once on the same graph (for example, when two losses share intermediate results), retain_graph must be set to True so that the intermediate buffers are kept and the two backward() calls do not interfere with each other.
If it is False, the intermediate buffers in the computation graph are freed as soon as the backward pass finishes.
In everyday use the parameter is normally left at its default for efficiency; by default it takes the same value as create_graph (i.e. False). A short sketch of the failure mode follows the documentation excerpt below.

retain_graph (bool, optional) ----- If False, the graph used to compute the grad will be freed. Note that in nearly all cases setting this option to True is not needed and often can be worked around in a much more efficient way. Defaults to the value of create_graph.
create_graph (bool, optional) ----- If True, graph of the derivative will be constructed, allowing to compute higher order derivative products. Defaults to False.
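As an illustration of the failure mode described above (this case is not shown in the example script below), calling backward() a second time on a graph whose buffers were already freed raises a RuntimeError:

import torch

w = torch.tensor([1.], requires_grad=True)
x = torch.tensor([2.], requires_grad=True)
y = torch.mul(torch.add(w, x), torch.add(w, 1))   # y = (x + w) * (w + 1)

y.backward()        # the first pass frees the graph's intermediate buffers
try:
    y.backward()    # the second pass has nothing left to traverse
except RuntimeError as e:
    print(e)        # "Trying to backward through the graph a second time ..."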

torch.autograd.grad ------ compute gradients

torch.autograd.grad(outputs, inputs, grad_outputs=None, retain_graph=None, create_graph=False)

outputs: the tensors to differentiate, e.g. the loss;
inputs: the tensors for which gradients are required;
create_graph: build a graph of the derivative computation, used for higher-order derivatives;
retain_graph: keep the computation graph;
grad_outputs: weights for multiple gradients (see the sketch below);
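The example script further below only differentiates scalar outputs, so here is a hedged sketch of how grad_outputs weights the elements of a non-scalar output (it plays the same role as grad_tensors in torch.autograd.backward):

import torch

x = torch.tensor([1., 2., 3.], requires_grad=True)
y = x ** 2                                   # non-scalar output, dy_i/dx_i = 2 * x_i

weights = torch.tensor([1., 1., 2.])         # one weight per element of y
(grad_x,) = torch.autograd.grad(y, x, grad_outputs=weights)
print(grad_x)                                # weights * 2x = tensor([2., 4., 12.])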

Notes:

1. Gradients are not zeroed automatically;
2. Nodes that depend on leaf nodes have requires_grad set to True by default (see the sketch below);
3. Leaf nodes cannot be modified by in-place operations.
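A small sketch of note 2 using the is_leaf attribute (these tensors are independent of the example script below):

import torch

w = torch.tensor([1.], requires_grad=True)   # leaf node
x = torch.tensor([2.], requires_grad=True)   # leaf node
a = w + x                                    # created by an operation, so not a leaf

print(w.is_leaf, x.is_leaf, a.is_leaf)       # True True False
print(a.requires_grad)                       # True: inherited from the leaf nodes it depends on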

Why can't leaf nodes be modified in place?

Because an in-place operation leaves the tensor's memory address unchanged while changing the data stored there. During the backward pass the computation graph fetches the data it needs by address; if an in-place operation has already overwritten that data, the gradients computed from it would be wrong, so in-place operations on leaf nodes are forbidden.

The following are PyTorch examples:

# torch.autograd.backward ------ automatically compute gradients
import torch
torch.manual_seed(10)
# ======= retain_graph
flag = False
if flag:
    w = torch.tensor([1.], requires_grad=True)
    x = torch.tensor([2.], requires_grad=True)

    a = torch.add(w, x)
    b = torch.add(w, 1)
    y = torch.mul(a, b)
#   m = torch.add(y, a)

    y.backward(retain_graph=True)        # keep the graph so that it can be traversed again below

    y.backward()                         # this second pass reuses the graph kept above
    print(w.grad)                        # dy/dw = 5 per pass, accumulated over two passes -> tensor([10.])

# ======= grad_tensors
flag = False
if flag:
    w = torch.tensor([1.], requires_grad=True)
    x = torch.tensor([2.], requires_grad=True)

    a = torch.add(w, x)     # a.retain_grad() here would keep a's gradient after backward
    b = torch.add(w, 1)

    y0 = torch.mul(a, b)    # y0 = (x+w) * (w+1)    dy0/dw = 5
    y1 = torch.add(a, b)    # y1 = (x+w) + (w+1)    dy1/dw = 2

    loss = torch.cat([y0, y1], dim=0)       # loss = [y0, y1]
    grad_tensors = torch.tensor([1., 3.])   # weights for the two gradients

    loss.backward(gradient=grad_tensors)    # gradient is passed through to grad_tensors in torch.autograd.backward()

    print(w.grad)                           # 1 * dy0/dw + 3 * dy1/dw = 1 * 5 + 3 * 2 = tensor([11.])

# ======= torch.autograd.grad ------ compute gradients
flag = False
if flag:

    x = torch.tensor([4.], requires_grad=True)
    y = torch.pow(x, 2)   # y = x**2

    # grad_1 = dy/dx = 2x = 2 * 4 = 8
    grad_1 = torch.autograd.grad(y, x, create_graph=True)   # create_graph=True so grad_1 can itself be differentiated
    print("first derivative of x: ", grad_1)
    print(grad_1[0])
    # grad_2 = d(dy/dx)/dx = d(2x)/dx = 2
    grad_2 = torch.autograd.grad(grad_1[0], x)
    print("second derivative of x: ", grad_2)

flag = False

if flag:

    w = torch.tensor([2.], requires_grad=True)
    x = torch.tensor([4.], requires_grad=True)

    for i in range(8):
        a = torch.add(w, x)
        b = torch.add(w, 1)
        y = torch.mul(a, b)

        y.backward()
        print(w.grad)
        # the gradient of w must be zeroed manually; the trailing underscore denotes an in-place operation
        w.grad.zero_()

        # nodes that depend on leaf nodes have requires_grad=True by default, so a, b and y are all True
        print(a.requires_grad, b.requires_grad, y.requires_grad)

# compare how the tensor's address changes for out-of-place vs in-place operations
flag = False

if flag:

    a = torch.ones((1, ))
    print(id(a), a)

    # out-of-place: the address changes after the operation
    # a = a + torch.ones((1, ))
    # print(id(a), a)
    # in-place: the address stays the same
    a += torch.ones((1, ))
    print(id(a), a)


# demonstrate that leaf nodes cannot be modified in place
flag = True

# flag = False
if flag:

    w = torch.tensor([1.], requires_grad=True)
    x = torch.tensor([2.], requires_grad=True)

    a = torch.add(w, x)
    b = torch.add(w, 1)
    y = torch.mul(a, b)

    w.add_(1)    # RuntimeError: a leaf Variable that requires grad is being used in an in-place operation

    y.backward()

