The negative log-likelihood (NLL) loss is used for multi-class classification; its input is log-probabilities.
For a batch $D(x, y)$ of $N$ samples, $x$ is the network output after normalization and a log transform (i.e., log-probabilities), and $y$ is the corresponding class label; each sample belongs to one of $C$ classes.
$l_n$ is the loss of the $n$-th sample, where $0 \leq y_n \leq C-1$:

$$l_n = -w_{y_n} x_{n, y_n}$$
`weight` addresses sample imbalance across the classes:

$$w_c = \text{weight}[c] \cdot 1\{c \neq \text{ignore\_index}\}$$
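A minimal sketch, with made-up inputs and class weights, checking the per-sample formula against `F.nll_loss(..., reduction='none')`:

```python
import torch
import torch.nn.functional as F

log_probs = torch.log_softmax(torch.randn(4, 3), dim=1)  # x: log-probabilities, 4 samples, 3 classes
target = torch.tensor([0, 2, 1, 2])                      # y: class labels
weight = torch.tensor([1.0, 2.0, 0.5])                   # made-up per-class weights w_c

# l_n = -w_{y_n} * x_{n, y_n}, computed directly from the formula
manual = -weight[target] * log_probs[torch.arange(4), target]

official = F.nll_loss(log_probs, target, weight=weight, reduction='none')
print(torch.allclose(manual, official))  # True
```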
```python
# Excerpt from PyTorch's torch/nn/modules/loss.py; Optional, Tensor, F,
# and _WeightedLoss come from that module's imports.
class NLLLoss(_WeightedLoss):
    __constants__ = ['ignore_index', 'reduction']
    ignore_index: int

    def __init__(self, weight: Optional[Tensor] = None, size_average=None, ignore_index: int = -100,
                 reduce=None, reduction: str = 'mean') -> None:
        super(NLLLoss, self).__init__(weight, size_average, reduce, reduction)
        self.ignore_index = ignore_index

    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        assert self.weight is None or isinstance(self.weight, Tensor)
        return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)
```
In PyTorch this is implemented by the `torch.nn.NLLLoss` class; the `F.nll_loss` function can also be called directly. `size_average` and `reduce` are deprecated. `reduction` takes one of three values, `mean`, `sum`, or `none`, which determine the returned $\ell(x, y)$. The default is `mean`, corresponding to the usual computation of the overall loss:
$$L = \{l_1, \ldots, l_N\}$$

$$\ell(x, y) = \begin{cases} L, & \text{if reduction} = \text{'none'} \\ \sum_{n=1}^{N} \frac{1}{\sum_{n=1}^{N} w_{y_n}} l_n, & \text{if reduction} = \text{'mean'} \\ \sum_{n=1}^{N} l_n, & \text{if reduction} = \text{'sum'} \end{cases}$$
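Note that when a `weight` vector is given, `'mean'` divides by the summed per-sample weights $\sum_{n} w_{y_n}$ rather than by $N$. A minimal sketch with made-up inputs illustrating all three modes:

```python
import torch
import torch.nn.functional as F

log_probs = torch.log_softmax(torch.randn(5, 3), dim=1)
target = torch.tensor([0, 1, 2, 1, 0])
weight = torch.tensor([1.0, 2.0, 0.5])   # made-up class weights

l_none = F.nll_loss(log_probs, target, weight=weight, reduction='none')  # L = {l_1, ..., l_N}
l_sum = F.nll_loss(log_probs, target, weight=weight, reduction='sum')
l_mean = F.nll_loss(log_probs, target, weight=weight, reduction='mean')

print(torch.allclose(l_sum, l_none.sum()))                          # True
print(torch.allclose(l_mean, l_none.sum() / weight[target].sum()))  # True: weighted mean
```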
The `ignore_index` parameter specifies a class to be ignored: errors for that class are not counted in the loss. The default is `-100`. For example, the labels at padding positions can be set to `ignore_index`.
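A minimal sketch of a hypothetical padding setup: positions labeled with `ignore_index` contribute nothing to the loss, and `'mean'` averages only over the remaining positions:

```python
import torch
import torch.nn as nn

loss_fct = nn.NLLLoss()  # ignore_index defaults to -100

log_probs = torch.log_softmax(torch.randn(4, 3), dim=1)
target = torch.tensor([1, 0, -100, -100])  # last two positions are "padding"

# The full loss equals the loss over only the two non-ignored samples.
full = loss_fct(log_probs, target)
kept = loss_fct(log_probs[:2], target[:2])
print(torch.allclose(full, kept))  # True
```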
In PyTorch, the `torch.nn.LogSoftmax` module performs this normalization and log transform of the network output:
$$\operatorname{LogSoftmax}(x_i) = \log\left(\frac{\exp(x_i)}{\sum_j \exp(x_j)}\right)$$
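`LogSoftmax` is the fused (and numerically more stable) equivalent of taking the log of a softmax; a quick check on random inputs:

```python
import torch

x = torch.randn(2, 5)
a = torch.nn.LogSoftmax(dim=1)(x)        # fused log-softmax
b = torch.log(torch.softmax(x, dim=1))   # log applied to softmax
print(torch.allclose(a, b))  # True
```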
The cross-entropy loss is used for multi-class classification; its input is the unnormalized network output:
$$\text{CrossEntropyLoss}(x, y) = \text{NLLLoss}(\operatorname{LogSoftmax}(x), y)$$
For a batch $D(x, y)$ of $N$ samples, $x$ is the unnormalized network output and $y$ is the corresponding class label; each sample belongs to one of $C$ classes. $l_n$ is the loss of the $n$-th sample, where $0 \leq y_n \leq C-1$:

$$l_n = -w_{y_n} \log\left(\frac{\exp(x_{n, y_n})}{\sum_{j=0}^{C-1} \exp(x_{n, j})}\right)$$
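Since $\log\frac{\exp(x_{n,y_n})}{\sum_j \exp(x_{n,j})} = x_{n,y_n} - \operatorname{logsumexp}(x_n)$, the formula can be checked directly against `F.cross_entropy`; a sketch with made-up inputs, `weight` omitted so $w_c = 1$:

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 3)           # x: unnormalized network output
target = torch.tensor([0, 2, 1, 2])  # y

# l_n = -(x_{n,y_n} - logsumexp(x_n)), per the formula above
manual = -(logits[torch.arange(4), target] - torch.logsumexp(logits, dim=1))

official = F.cross_entropy(logits, target, reduction='none')
print(torch.allclose(manual, official))  # True
```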
```python
# Likewise excerpted from torch/nn/modules/loss.py.
class CrossEntropyLoss(_WeightedLoss):
    __constants__ = ['ignore_index', 'reduction']
    ignore_index: int

    def __init__(self, weight: Optional[Tensor] = None, size_average=None, ignore_index: int = -100,
                 reduce=None, reduction: str = 'mean') -> None:
        super(CrossEntropyLoss, self).__init__(weight, size_average, reduce, reduction)
        self.ignore_index = ignore_index

    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        assert self.weight is None or isinstance(self.weight, Tensor)
        return F.cross_entropy(input, target, weight=self.weight,
                               ignore_index=self.ignore_index, reduction=self.reduction)
```
In PyTorch this is implemented by the `torch.nn.CrossEntropyLoss` class; the `F.cross_entropy` function can also be called directly. `size_average` and `reduce` are deprecated. `reduction` takes one of three values, `mean`, `sum`, or `none`, which determine the returned $\ell(x, y)$. The default is `mean`, corresponding to the usual computation of the overall loss:
$$L = \{l_1, \ldots, l_N\}$$

$$\ell(x, y) = \begin{cases} L, & \text{if reduction} = \text{'none'} \\ \sum_{n=1}^{N} \frac{1}{\sum_{n=1}^{N} w_{y_n}} l_n, & \text{if reduction} = \text{'mean'} \\ \sum_{n=1}^{N} l_n, & \text{if reduction} = \text{'sum'} \end{cases}$$
Verify that $\text{CrossEntropyLoss}(x, y) = \text{NLLLoss}(\operatorname{LogSoftmax}(x), y)$:
```python
import torch
import torch.nn as nn

# Multi-class classification
m = torch.nn.LogSoftmax(dim=1)
loss_nll_fct = nn.NLLLoss(reduction="mean")
loss_ce_fct = nn.CrossEntropyLoss(reduction="mean")

input_src = torch.Tensor([[0.8, 0.9, 0.3], [0.8, 0.9, 0.3], [0.8, 0.9, 0.3], [0.8, 0.9, 0.3]])
target = torch.Tensor([1, 1, 0, 0]).long()
# 4 samples, 3 classes
print(input_src.size())
print(target.size())

# NLLLoss applied to the log-softmaxed output
output = m(input_src)
loss_nll = loss_nll_fct(output, target)
print(loss_nll.item())

# Check that CrossEntropyLoss on the raw input gives the same value
loss_ce = loss_ce_fct(input_src, target)
print(loss_ce.item())
```
Output:

```
torch.Size([4, 3])
torch.Size([4])
0.9475762844085693
0.9475762844085693
```
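The same equivalence can be re-checked with the functional API mentioned above, `F.nll_loss` and `F.cross_entropy`:

```python
import torch
import torch.nn.functional as F

logits = torch.Tensor([[0.8, 0.9, 0.3], [0.8, 0.9, 0.3], [0.8, 0.9, 0.3], [0.8, 0.9, 0.3]])
target = torch.tensor([1, 1, 0, 0])

loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), target, reduction='mean')
loss_ce = F.cross_entropy(logits, target, reduction='mean')
print(loss_nll.item(), loss_ce.item())  # both print 0.9475762844085693, matching the output above
```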