This post mainly covers:
Neural networks mainly handle two kinds of problems: classification and regression. For regression, this post focuses on the mean squared error loss, though some regression problems require a custom loss function tailored to the situation. For classification, it covers the binary cross-entropy and the multi-class cross-entropy losses.
Before going through the individual functions, a quick note on the general usage pattern:
# 1. Create the loss function object and specify how the result is reduced; the default is the mean. MSELoss is used as the example
criterion = MSELoss(reduction='...')
# 2. Define the input x and the target y
x = torch.tensor(...)
y = torch.tensor(...)
# 3. Compute the loss
loss = criterion(x, y)
The size_average and reduce arguments have been deprecated; the reduction argument now controls how the loss is reduced.
class torch.nn.MSELoss(size_average=None, reduce=None, reduction='elementwise_mean')
Computes the mean squared error over the n elements of the input x and the label y; x and y must have the same size.
The loss is defined as:
\ell(x, y) = L = \{l_1, \dots, l_N\}^\top, \quad l_n = (x_n - y_n)^2
If reduction != 'none':
\ell(x, y) = \begin{cases} \operatorname{mean}(L), & \text{if reduction} = \text{'elementwise\_mean'}, \\ \operatorname{sum}(L), & \text{if reduction} = \text{'sum'}. \end{cases}
shape: N is the number of samples in the batch
The code below shows the basic usage and how the reduction argument affects the returned result:
import torch
from torch import nn
criterion_none = nn.MSELoss(reduction='none')
criterion_elementwise_mean = nn.MSELoss(reduction='elementwise_mean')
criterion_sum = nn.MSELoss(reduction='sum')
x = torch.randn(3, 2, requires_grad=True)
y = torch.randn(3, 2)
loss_none = criterion_none(x, y)
loss_elementwise_mean = criterion_elementwise_mean(x, y)
loss_sum = criterion_sum(x, y)
print('reduction={}: {}'.format('none', loss_none.detach().numpy()))
print('reduction={}: {}'.format('elementwise_mean', loss_elementwise_mean.item()))
print('reduction={}: {}'.format('sum', loss_sum.item()))
out:
reduction=none:
[[0.02320575 0.30483633]
[0.04768182 0.4319028 ]
[3.11864 7.9872203 ]]
reduction=elementwise_mean: 1.9855811595916748 # mean * 6 elements ≈ sum: 1.9856 * 6 ≈ 11.9135
reduction=sum: 11.913487434387207
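As the output shows, the reduced results are just the mean and the sum of the element-wise losses. A quick sanity check, reusing the tensors and loss objects defined above, makes this explicit:
# The reduced variants aggregate the 'none' result.
assert torch.allclose(loss_none.mean(), loss_elementwise_mean)
assert torch.allclose(loss_none.sum(), loss_sum)
# mean * number of elements == sum: 1.9856 * 6 ≈ 11.9135
assert torch.allclose(loss_none.mean() * loss_none.numel(), loss_sum)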
This blog post explains what cross entropy is and why it is used as a loss; if you are not familiar with cross entropy, it is worth a read:
http://jackon.me/posts/why-use-cross-entropy-error-for-loss-function/
This post walks through the derivative of the cross-entropy loss, for those interested:
https://zhuanlan.zhihu.com/p/35709485
class torch.nn.BCELoss(weight=None, size_average=None, reduce=None, reduction='elementwise_mean')
The binary cross-entropy loss. Compute the Sigmoid of the input before calling this function: https://pytorch.org/docs/stable/nn.html#torch.nn.Sigmoid
The loss is defined as:
\ell(x, y) = L = \{l_1, \dots, l_N\}^\top, \quad l_n = -w_n \left[ y_n \cdot \log x_n + (1 - y_n) \cdot \log(1 - x_n) \right]
If reduction != 'none':
\ell(x, y) = \begin{cases} \operatorname{mean}(L), & \text{if reduction} = \text{'elementwise\_mean'}, \\ \operatorname{sum}(L), & \text{if reduction} = \text{'sum'}. \end{cases}
shape: N is the number of samples in the batch
Documentation for random_: https://pytorch.org/docs/master/tensors.html#torch.Tensor.random_
In-place random sampling: https://pytorch.org/docs/master/torch.html#in-place-random-sampling
import torch
from torch import nn
m = nn.Sigmoid()
criterion = nn.BCELoss()
x = torch.randn(3, requires_grad=True)
# random_(from=0, to): draws discrete integers uniformly from [from, to - 1]
# so y takes the value 0 or 1
y = torch.empty(3).random_(2)
# apply the sigmoid to x before computing the loss
loss = criterion(m(x), y)
print(loss.item())
out:
0.8146645426750183
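To confirm that BCELoss really implements the formula above (with the default weight w_n = 1 and mean reduction), the loss can be reproduced by hand; a small sketch reusing m, x and y from the example:
# Manual BCE: l_n = -[y_n*log(p_n) + (1 - y_n)*log(1 - p_n)], then averaged over the batch.
p = m(x)
manual = -(y * torch.log(p) + (1 - y) * torch.log(1 - p)).mean()
assert torch.allclose(manual, loss)
# nn.BCEWithLogitsLoss fuses the Sigmoid and BCELoss steps and is more numerically stable.
assert torch.allclose(nn.BCEWithLogitsLoss()(x, y), loss)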
class torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='elementwise_mean')
https://pytorch.org/docs/stable/nn.html#torch.nn.CrossEntropyLoss
No softmax should be applied to the input before calling this function, and the target is given as class indices, not in one-hot format.
shape: N is the number of samples in the batch
The loss expression:
\text{loss}(x, class) = -\log\left(\frac{\exp(x[class])}{\sum_j \exp(x[j])}\right) = -x[class] + \log\left(\sum_j \exp(x[j])\right)
import torch
from torch import nn
criterion = nn.CrossEntropyLoss()
x = torch.randn(3, 5, requires_grad=True)
# random_(from=0, to): draws discrete integers uniformly from [from, to - 1]
# so y takes an integer class label in [0, 4]
y = torch.empty(3, dtype=torch.long).random_(5)
# x is passed in as raw logits; CrossEntropyLoss applies log-softmax internally
loss = criterion(x, y)
print(loss.item())
out:
1.887318730354309
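CrossEntropyLoss is equivalent to LogSoftmax followed by NLLLoss, which is exactly why no softmax should be applied beforehand. A small sketch (reusing x and y from the example, and importing torch.nn.functional) verifies this and the formula above:
from torch.nn import functional as F
# Equivalent form: log_softmax followed by the negative log-likelihood loss.
assert torch.allclose(F.nll_loss(F.log_softmax(x, dim=1), y), loss)
# Direct evaluation of the formula: -x[class] + log(sum_j exp(x[j])), averaged over the batch.
manual = (-x[torch.arange(3), y] + torch.logsumexp(x, dim=1)).mean()
assert torch.allclose(manual, loss)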
What if the target is passed in one-hot format instead?
The code below mainly builds the one_hot encoding, using scatter_:
import torch
from torch import nn
from torch.nn import functional as F
# build the one_hot encoding
def one_hot(y):
    '''
    y: 1-D tensor of shape (N); each value is the class index of a sample
    out:
        y_onehot: y converted to one-hot format
    '''
    y = y.view(-1, 1)
    # the (3, 5) shape is hard-coded for this example: 3 samples, 5 classes
    y_onehot = torch.FloatTensor(3, 5)
    # zero-initialise, then write a 1 at each label position
    y_onehot.zero_()
    y_onehot.scatter_(1, y, 1)
    return y_onehot
def cross_entropy_one_hot(input_, target):
    # decode the one-hot target back to class indices
    _, labels = target.max(dim=1)
    # then call the built-in cross_entropy
    return F.cross_entropy(input_, labels)
# compute the loss: loss_1 = cross_entropy_one_hot(x, one_hot(y))
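A quick check of the helper with a hypothetical labels tensor (note that the (3, 5) shape is hard-coded above, so it expects exactly 3 labels in the range [0, 4]):
labels = torch.tensor([2, 0, 4])   # hypothetical example labels
print(one_hot(labels))
# tensor([[0., 0., 1., 0., 0.],
#         [1., 0., 0., 0., 0.],
#         [0., 0., 0., 0., 1.]])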
Multi-class cross entropy with a one-hot (or soft) target:
[0.1, 0.2, 0.7] (prediction) ------------------ [1.0, 0.0, 0.0] (target)
gives the loss: -(1.0 * log(0.1) + 0.0 * log(0.2) + 0.0 * log(0.7)) ≈ 2.3026
def cross_entropy(input_, target, reduction='elementwise_mean'):
    """ Cross entropy that accepts soft targets
    Args:
        input_: raw (unnormalised) predictions from the network
        target: targets, can be soft (a probability distribution per sample)
        reduction: 'elementwise_mean', 'sum' or 'none'
    Examples::
        input_ = torch.tensor([[1.1, 2.8, 1.3], [1.1, 2.1, 4.8]], requires_grad=True)
        target = torch.tensor([[0.05, 0.9, 0.05], [0.05, 0.05, 0.9]])
        loss = cross_entropy(input_, target)
        loss.backward()
    """
    logsoftmax = nn.LogSoftmax(dim=1)
    res = -target * logsoftmax(input_)
    if reduction == 'elementwise_mean':
        return torch.mean(torch.sum(res, dim=1))
    elif reduction == 'sum':
        return torch.sum(torch.sum(res, dim=1))
    else:
        return res
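The hand-worked [0.1, 0.2, 0.7] example can be checked against this function. Since logsoftmax(log(p)) = log(p) whenever p already sums to 1, feeding the log of the predicted probabilities reproduces -1.0 * log(0.1) ≈ 2.3026; a small sketch under that assumption:
pred = torch.log(torch.tensor([[0.1, 0.2, 0.7]]))   # log-probabilities used as the raw input
soft_target = torch.tensor([[1.0, 0.0, 0.0]])
print(cross_entropy(pred, soft_target).item())      # ≈ 2.3026 == -log(0.1)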
Finally, here is the complete code for the three ways of computing the loss:
import torch
from torch import nn
from torch.nn import functional as F
x = torch.randn(3, 5, requires_grad=True)
# random_(from=0, to): draws discrete integers uniformly from [from, to - 1]
# so y takes an integer class label in [0, 4]
y = torch.empty(3, dtype=torch.long).random_(5)
def one_hot(y):
    '''
    y: 1-D tensor of shape (N); each value is the class index of a sample
    out:
        y_onehot: y converted to one-hot format
    '''
    y = y.view(-1, 1)
    # the (3, 5) shape is hard-coded for this example: 3 samples, 5 classes
    y_onehot = torch.FloatTensor(3, 5)
    # zero-initialise, then write a 1 at each label position
    y_onehot.zero_()
    y_onehot.scatter_(1, y, 1)
    return y_onehot
# 1. PyTorch's built-in loss function
criterion = nn.CrossEntropyLoss()
# 2. Decode the one_hot target, then use PyTorch's built-in cross_entropy
def cross_entropy_one_hot(input_, target):
    # decode the one-hot target back to class indices
    _, labels = target.max(dim=1)
    return F.cross_entropy(input_, labels)
    # the following is equivalent:
    # return nn.CrossEntropyLoss()(input_, labels)
# 3. Custom implementation
def cross_entropy(input_, target, reduction='elementwise_mean'):
    """ Cross entropy that accepts soft targets
    Args:
        input_: raw (unnormalised) predictions from the network
        target: targets, can be soft (a probability distribution per sample)
        reduction: 'elementwise_mean', 'sum' or 'none'
    Examples::
        input_ = torch.tensor([[1.1, 2.8, 1.3], [1.1, 2.1, 4.8]], requires_grad=True)
        target = torch.tensor([[0.05, 0.9, 0.05], [0.05, 0.05, 0.9]])
        loss = cross_entropy(input_, target)
        loss.backward()
    """
    logsoftmax = nn.LogSoftmax(dim=1)
    res = -target * logsoftmax(input_)
    if reduction == 'elementwise_mean':
        return torch.mean(torch.sum(res, dim=1))
    elif reduction == 'sum':
        return torch.sum(torch.sum(res, dim=1))
    else:
        return res
loss = criterion(x, y)
print("loss",loss.item())
loss_1 = cross_entropy_one_hot(x, one_hot(y))
print("loss_one_hot", loss_1.item())
loss_2 = cross_entropy(x, one_hot(y))
print("loss_custom", loss_2.item())
out:
loss 3.0437448024749756
loss_one_hot 3.0437448024749756
loss_custom 3.0437448024749756
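The point of the custom version is that the target does not have to be a hard one-hot vector: it also accepts soft label distributions. A minimal sketch, building a label-smoothing style target from the same x and y:
# Each row becomes a probability distribution: 0.84 on the true class, 0.04 on each other class.
soft_y = one_hot(y) * 0.8 + 0.2 / 5
loss_soft = cross_entropy(x, soft_y)
print("loss_soft", loss_soft.item())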
Reference:
https://discuss.pytorch.org/t/cross-entropy-with-one-hot-targets/13580/3