1. Activation Functions
An activation function is a function applied at the neurons of an artificial neural network; it maps a neuron's input to its output.
Commonly used activation functions:
import torch
a = torch.linspace(-10, 10, 20)
# sigmoid: output range (0, 1)
res = torch.sigmoid(a)
print(res)
# tanh: output range (-1, 1)
res = torch.tanh(a)
print(res)
# ReLU: output range [0, +inf)
res = torch.relu(a)
print(res)
Output:
tensor([4.5398e-05, 1.3006e-04, 3.7256e-04, 1.0667e-03, 3.0503e-03, 8.6901e-03,
2.4502e-02, 6.7134e-02, 1.7094e-01, 3.7138e-01, 6.2862e-01, 8.2906e-01,
9.3287e-01, 9.7550e-01, 9.9131e-01, 9.9695e-01, 9.9893e-01, 9.9963e-01,
9.9987e-01, 9.9995e-01])
tensor([-1.0000, -1.0000, -1.0000, -1.0000, -1.0000, -0.9998, -0.9987, -0.9897,
-0.9184, -0.4826, 0.4826, 0.9184, 0.9897, 0.9987, 0.9998, 1.0000,
1.0000, 1.0000, 1.0000, 1.0000])
tensor([ 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000,
0.0000, 0.0000, 0.5263, 1.5789, 2.6316, 3.6842, 4.7368, 5.7895,
6.8421, 7.8947, 8.9474, 10.0000])
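These built-ins can be checked against their defining formulas: sigmoid(x) = 1 / (1 + e^(-x)), tanh(x) = (e^x - e^(-x)) / (e^x + e^(-x)), and ReLU(x) = max(0, x). A minimal sketch of that check (comparing with torch.allclose at default tolerances is an arbitrary choice here):
import torch
a = torch.linspace(-10, 10, 20)
# sigmoid from its definition
manual_sigmoid = 1 / (1 + torch.exp(-a))
# tanh from its definition
manual_tanh = (torch.exp(a) - torch.exp(-a)) / (torch.exp(a) + torch.exp(-a))
# ReLU as clamping negative values to 0
manual_relu = torch.clamp(a, min=0)
print(torch.allclose(manual_sigmoid, torch.sigmoid(a)))  # expected: True
print(torch.allclose(manual_tanh, torch.tanh(a)))        # expected: True
print(torch.allclose(manual_relu, torch.relu(a)))        # expected: True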
2. Gradient of the Loss
Mean squared error (MSE) is a commonly used regression loss: it is the mean of the squared differences between the predicted values and the target values.
import torch
import torch.nn.functional as F
x = torch.ones(1)
w = torch.full([1], 2.)
w.requires_grad_()
# mean squared error between the target (1) and the prediction x * w
mse = F.mse_loss(torch.ones(1), x * w)
print(x, w, mse)
# gradient of the loss with respect to w
grad = torch.autograd.grad(mse, [w])
print(grad)
Output:
tensor([1.]) tensor([2.], requires_grad=True) tensor(1., grad_fn=<MseLossBackward0>)
(tensor([2.]),)
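For this scalar case the gradient can also be derived by hand: with loss = (x·w − y)², we get ∂loss/∂w = 2·(x·w − y)·x = 2·(2 − 1)·1 = 2, which matches the tensor([2.]) returned by autograd. A minimal sketch of that check:
import torch
import torch.nn.functional as F
x = torch.ones(1)
w = torch.full([1], 2., requires_grad=True)
y = torch.ones(1)
mse = F.mse_loss(y, x * w)
grad_autograd = torch.autograd.grad(mse, [w])[0]
# hand-derived gradient: d/dw (x*w - y)^2 = 2 * (x*w - y) * x
grad_manual = (2 * (x * w - y) * x).detach()
print(grad_autograd, grad_manual)  # both equal tensor([2.])
The same torch.autograd.grad call also works when the intermediate quantity is a vector of probabilities, as in the softmax example below, where each output p[i] depends on every entry of a: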
import torch
import torch.nn.functional as F
a = torch.rand(3)
a.requires_grad_()
# softmax turns a into a probability vector; every p[i] depends on all entries of a
p = F.softmax(a, dim=0)
# retain_graph=True keeps the graph alive so a second grad call is possible
print(torch.autograd.grad(p[1], [a], retain_graph=True))
print(torch.autograd.grad(p[2], [a]))
Output:
(tensor([-0.1407, 0.2497, -0.1090]),)
(tensor([-0.0658, -0.1090, 0.1748]),)
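The signs in the output follow from the softmax Jacobian ∂p_i/∂a_j = p_i·(δ_ij − p_j): the diagonal entry (i = j) is positive and the off-diagonal entries are negative, which is why each returned gradient has exactly one positive component. A minimal sketch that rebuilds the full Jacobian from p and compares it with the one autograd computes:
import torch
import torch.nn.functional as F
a = torch.rand(3)
p = F.softmax(a, dim=0)
# analytic Jacobian: J[i, j] = p[i] * (delta_ij - p[j])
jac_manual = torch.diag(p) - torch.outer(p, p)
# Jacobian computed by autograd
jac_autograd = torch.autograd.functional.jacobian(lambda t: F.softmax(t, dim=0), a)
print(torch.allclose(jac_manual, jac_autograd, atol=1e-6))  # expected: True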
3. Gradient Derivation for a Perceptron
A single-output perceptron applies a sigmoid to the weighted sum of its inputs; the MSE loss against a target of 1 is then differentiated with respect to the weights.
import torch
import torch.nn.functional as F
x = torch.randn(1, 10)                      # one sample with 10 features
w = torch.randn(1, 10, requires_grad=True)  # weights of a single-output perceptron
o = torch.sigmoid(x @ w.t())                # perceptron output, shape (1, 1)
print(o.shape)
loss = F.mse_loss(torch.ones(1, 1), o)      # MSE against a target of 1
loss.backward()                             # populates w.grad
print(w.grad)
Output:
torch.Size([1, 1])
tensor([[ 0.4680, -0.5402, 0.2481, -0.1499, -0.0996, -0.2685, -0.2663, 0.2420,
-0.3194, -0.1558]])
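The chain rule gives the same result by hand. With o = σ(x·wᵀ), loss = (o − t)² and t = 1, the gradient is ∂loss/∂w_j = 2·(o − t)·o·(1 − o)·x_j, using σ'(z) = σ(z)·(1 − σ(z)). A minimal sketch comparing this expression against w.grad:
import torch
import torch.nn.functional as F
x = torch.randn(1, 10)
w = torch.randn(1, 10, requires_grad=True)
t = torch.ones(1, 1)
o = torch.sigmoid(x @ w.t())
loss = F.mse_loss(t, o)
loss.backward()
# hand-derived gradient: 2 * (o - t) * sigmoid'(z) * x, with sigmoid'(z) = o * (1 - o)
grad_manual = 2 * (o - t) * o * (1 - o) * x
print(torch.allclose(w.grad, grad_manual.detach()))  # expected: True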
4. Automatic Differentiation
import torch
# requires_grad=True: autograd tracks operations on x and accumulates its gradient
x = torch.ones((2, 2), requires_grad=True)
print(x)
y = x + 2
print(y)
z = y * y * 3
print(z)
# out is the mean of z, so it is a scalar that depends on x
out = z.mean()
print(out)
# backward() differentiates out and accumulates gradients into the leaf tensors
out.backward()
# gradient of out with respect to x
print(x.grad)
# y is a non-leaf tensor, so its gradient is not retained; this prints None
print(y.grad)
Output:
tensor([[1., 1.],
[1., 1.]], requires_grad=True)
tensor([[3., 3.],
[3., 3.]], grad_fn=<AddBackward0>)
tensor([[27., 27.],
[27., 27.]], grad_fn=<MulBackward0>)
tensor(27., grad_fn=<MeanBackward0>)
tensor([[4.5000, 4.5000],
[4.5000, 4.5000]])
None
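The value 4.5 follows directly: out = (1/4)·Σ 3·(x_i + 2)², so ∂out/∂x_i = (3/2)·(x_i + 2) = 4.5 at x_i = 1. y.grad is None because y is an intermediate (non-leaf) tensor and PyTorch does not keep its gradient by default; calling retain_grad() on y before the backward pass makes it available. A minimal sketch:
import torch
x = torch.ones((2, 2), requires_grad=True)
y = x + 2
# ask autograd to keep the gradient of the non-leaf tensor y
y.retain_grad()
z = y * y * 3
out = z.mean()
out.backward()
print(x.grad)  # all 4.5, since d(out)/dx_i = 1.5 * (x_i + 2)
print(y.grad)  # also all 4.5, since d(out)/dy_i = 1.5 * y_i with y_i = 3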