Loss functions: NLLLoss and CrossEntropyLoss

NLLLoss

The negative log likelihood loss, used for multi-class classification. Its input is log-probabilities (normalized and log-transformed scores).

For a batch of data $D(x, y)$ with $N$ samples, $x$ is the network output after normalization and log-transformation, and $y$ contains the class label of each sample; every sample belongs to one of $C$ classes.

$l_n$ denotes the loss of the $n$-th sample, with $0 \leq y_n \leq C-1$:

$$l_n = -w_{y_n} x_{n, y_n}$$
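As the formula shows, NLLLoss simply picks out the log-probability of the true class and negates it. A minimal sketch (the input values below are made up for illustration):

import torch
import torch.nn.functional as F

# toy log-probabilities for 2 samples over 3 classes
x = F.log_softmax(torch.tensor([[0.5, 1.0, -0.2], [2.0, 0.1, 0.3]]), dim=1)
y = torch.tensor([1, 0])

# l_n = -x[n, y_n]: pick the log-probability of the true class and negate it
manual = -x.gather(1, y.unsqueeze(1)).squeeze(1)
print(torch.allclose(manual, F.nll_loss(x, y, reduction="none")))  # True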

The weight argument is used to handle class imbalance among the $C$ classes:

$$w_c = \text{weight}[c] \cdot 1\{c \neq \text{ignore\_index}\}$$
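Extending the sketch above with an arbitrary weight vector, each per-sample loss is simply rescaled by the weight of its target class:

import torch
import torch.nn.functional as F

x = F.log_softmax(torch.tensor([[0.5, 1.0, -0.2], [2.0, 0.1, 0.3]]), dim=1)
y = torch.tensor([1, 0])
w = torch.tensor([0.3, 0.7, 1.0])  # per-class weights, chosen arbitrarily

# l_n = -w[y_n] * x[n, y_n]
manual = -w[y] * x.gather(1, y.unsqueeze(1)).squeeze(1)
print(torch.allclose(manual, F.nll_loss(x, y, weight=w, reduction="none")))  # True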

# Excerpted from the PyTorch source (torch/nn/modules/loss.py);
# imports added so the snippet is self-contained.
from typing import Optional

from torch import Tensor
import torch.nn.functional as F
from torch.nn.modules.loss import _WeightedLoss

class NLLLoss(_WeightedLoss):
    __constants__ = ['ignore_index', 'reduction']
    ignore_index: int
    def __init__(self, weight: Optional[Tensor] = None, size_average=None, ignore_index: int = -100,
                 reduce=None, reduction: str = 'mean') -> None:
        super(NLLLoss, self).__init__(weight, size_average, reduce, reduction)
        self.ignore_index = ignore_index
    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        assert self.weight is None or isinstance(self.weight, Tensor)
        return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)

PyTorch implements this as the torch.nn.NLLLoss class; the F.nll_loss function can also be called directly. size_average and reduce are deprecated. reduction takes one of three values, mean, sum, or none, each yielding a different return value $\ell(x, y)$. The default is mean, the usual way to compute the overall loss.

$$L=\left\{l_{1}, \ldots, l_{N}\right\}$$

$$\ell(x, y)=\begin{cases} L, & \text{if reduction = 'none'} \\ \sum_{n=1}^{N} \frac{1}{\sum_{n=1}^{N} w_{y_n}} l_n, & \text{if reduction = 'mean'} \\ \sum_{n=1}^{N} l_n, & \text{if reduction = 'sum'} \end{cases}$$
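The subtle case is 'mean' with a weight vector: the denominator is the sum of the weights of the targets actually present, not the batch size $N$. A small sketch with random inputs:

import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(4, 3).log_softmax(dim=1)
y = torch.tensor([1, 1, 0, 2])
w = torch.tensor([0.2, 0.5, 1.0])

per_sample = nn.NLLLoss(weight=w, reduction="none")(x, y)  # already includes w[y_n]
# 'sum' adds the weighted per-sample losses
print(torch.isclose(nn.NLLLoss(weight=w, reduction="sum")(x, y), per_sample.sum()))
# 'mean' divides by the sum of the weights actually used, not by N
print(torch.isclose(nn.NLLLoss(weight=w, reduction="mean")(x, y),
                    per_sample.sum() / w[y].sum()))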

The ignore_index argument specifies a class to ignore: errors on that class do not contribute to the loss. It defaults to -100. A typical use is assigning the class at padding positions to ignore_index.
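For example, with two padding positions labeled -100, only the remaining samples contribute, and with the default reduction='mean' the average is taken over those samples only:

import torch
import torch.nn as nn
import torch.nn.functional as F

pad_id = -100  # the default ignore_index
x = F.log_softmax(torch.randn(4, 3), dim=1)
y = torch.tensor([1, 2, pad_id, pad_id])  # last two positions are padding

loss = nn.NLLLoss(ignore_index=pad_id)(x, y)
# only the first two samples contribute; the mean is over 2, not 4
manual = -(x[0, 1] + x[1, 2]) / 2
print(torch.isclose(loss, manual))  # True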

LogSoftmax

In PyTorch, torch.nn.LogSoftmax normalizes the network output and takes its logarithm:

$$\operatorname{LogSoftmax}\left(x_{i}\right)=\log \left(\frac{\exp \left(x_{i}\right)}{\sum_{j} \exp \left(x_{j}\right)}\right)$$
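Numerically, LogSoftmax agrees with composing log and softmax on well-scaled inputs; the fused version is simply safer against overflow:

import torch

x = torch.randn(2, 5)
m = torch.nn.LogSoftmax(dim=1)
print(torch.allclose(m(x), torch.log(torch.softmax(x, dim=1))))  # True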

CrossEntropyLoss

The cross-entropy loss, also used for multi-class classification. Its input is the unnormalized network output (raw logits).

$$\operatorname{CrossEntropyLoss}(x, y)=\operatorname{NLLLoss}(\operatorname{LogSoftmax}(x), y)$$

For a batch of data $D(x, y)$ with $N$ samples, $x$ is the unnormalized network output, and $y$ contains the class label of each sample; every sample belongs to one of $C$ classes.

$l_n$ denotes the loss of the $n$-th sample, with $0 \leq y_n \leq C-1$:

$$l_{n} =-w_{y_{n}}\left(\log \frac{\exp(x_{n, y_{n}})}{\sum_{j=0}^{C-1} \exp (x_{n,j})}\right)$$
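Rewriting the formula as $l_n = -(x_{n,y_n} - \log\sum_j \exp(x_{n,j}))$, it can be checked directly against F.cross_entropy using torch.logsumexp (inputs below are arbitrary):

import torch
import torch.nn.functional as F

logits = torch.randn(4, 3)  # unnormalized network output
y = torch.tensor([1, 1, 0, 0])

# l_n = -(x[n, y_n] - logsumexp_j x[n, j])
manual = -(logits.gather(1, y.unsqueeze(1)).squeeze(1)
           - torch.logsumexp(logits, dim=1))
print(torch.allclose(manual, F.cross_entropy(logits, y, reduction="none")))  # True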

# Excerpted from the PyTorch source (torch/nn/modules/loss.py);
# imports added so the snippet is self-contained.
from typing import Optional

from torch import Tensor
import torch.nn.functional as F
from torch.nn.modules.loss import _WeightedLoss

class CrossEntropyLoss(_WeightedLoss):
    __constants__ = ['ignore_index', 'reduction']
    ignore_index: int
    def __init__(self, weight: Optional[Tensor] = None, size_average=None, ignore_index: int = -100,
                 reduce=None, reduction: str = 'mean') -> None:
        super(CrossEntropyLoss, self).__init__(weight, size_average, reduce, reduction)
        self.ignore_index = ignore_index
    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        assert self.weight is None or isinstance(self.weight, Tensor)
        return F.cross_entropy(input, target, weight=self.weight,
                               ignore_index=self.ignore_index, reduction=self.reduction)

PyTorch implements this as the torch.nn.CrossEntropyLoss class; the F.cross_entropy function can also be called directly. size_average and reduce are deprecated. reduction takes one of three values, mean, sum, or none, each yielding a different return value $\ell(x, y)$. The default is mean, the usual way to compute the overall loss.

$$L=\left\{l_{1}, \ldots, l_{N}\right\}$$

$$\ell(x, y)=\begin{cases} L, & \text{if reduction = 'none'} \\ \sum_{n=1}^{N} \frac{1}{\sum_{n=1}^{N} w_{y_n}} l_n, & \text{if reduction = 'mean'} \\ \sum_{n=1}^{N} l_n, & \text{if reduction = 'sum'} \end{cases}$$

Verification that $\operatorname{CrossEntropyLoss}(x, y)=\operatorname{NLLLoss}(\operatorname{LogSoftmax}(x), y)$:

import torch
import torch.nn as nn

# multi-class classification
m = torch.nn.LogSoftmax(dim=1)
loss_nll_fct = nn.NLLLoss(reduction="mean")
loss_ce_fct = nn.CrossEntropyLoss(reduction="mean")
input_src = torch.Tensor([[0.8, 0.9, 0.3], [0.8, 0.9, 0.3], [0.8, 0.9, 0.3], [0.8, 0.9, 0.3]])
target = torch.Tensor([1, 1, 0, 0]).long()
# 4 samples, 3 classes
print(input_src.size())
print(target.size())
output = m(input_src)
loss_nll = loss_nll_fct(output, target)
print(loss_nll.item())
# check that CrossEntropyLoss on the raw logits gives the same value
loss_ce = loss_ce_fct(input_src, target)
print(loss_ce.item())

Output:

torch.Size([4, 3])
torch.Size([4])
0.9475762844085693
0.9475762844085693
