NLLLoss
The negative log likelihood loss, used for multi-class classification; its input is log-probabilities.
For a batch of data $D(x, y)$ containing $N$ samples, $x$ is the output of the network after normalization and a log transform (i.e. log-probabilities), and $y$ holds the class labels, where each sample belongs to one of $C$ classes, $y_n \in [0, C-1]$. The loss of the $n$-th sample is

$$l_n = -w_{y_n}\, x_{n, y_n}$$

where $w_{y_n}$ is the weight of class $y_n$, used to handle sample imbalance across the $C$ classes. The weight $w$ corresponds to the weight argument of the class below; the other key argument is ignore_index.
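As a quick check of the formula, here is a minimal sketch (all values chosen arbitrarily) that computes $l_n = -w_{y_n} x_{n, y_n}$ by hand and compares it with NLLLoss:

import torch
import torch.nn as nn

# Arbitrary class weights that up-weight class 2.
w = torch.tensor([1.0, 1.0, 3.0])
log_probs = torch.log_softmax(torch.randn(4, 3), dim=1)  # normalized + log
target = torch.tensor([0, 2, 1, 2])

# l_n = -w_{y_n} * x_{n, y_n}: pick the log-probability at the target index.
manual = -w[target] * log_probs[torch.arange(4), target]
print(manual)
print(nn.NLLLoss(weight=w, reduction="none")(log_probs, target))  # identical

# With reduction="mean" and a weight, the sum is divided by sum(w[y_n]), not N.
print(manual.sum() / w[target].sum())
print(nn.NLLLoss(weight=w, reduction="mean")(log_probs, target))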
class NLLLoss(_WeightedLoss):
    __constants__ = ['ignore_index', 'reduction']
    ignore_index: int

    def __init__(self, weight: Optional[Tensor] = None, size_average=None, ignore_index: int = -100,
                 reduce=None, reduction: str = 'mean') -> None:
        super(NLLLoss, self).__init__(weight, size_average, reduce, reduction)
        self.ignore_index = ignore_index

    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        assert self.weight is None or isinstance(self.weight, Tensor)
        return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)
PyTorch implements this as the class torch.nn.NLLLoss; the function F.nll_loss can also be called directly. The weight argument in the code is the $w$ above. size_average and reduce are deprecated. reduction takes one of three values, mean, sum, and none, corresponding to different return forms; the default is mean, which corresponds to the usual overall loss $\ell = \sum_{n=1}^{N} l_n \big/ \sum_{n=1}^{N} w_{y_n}$. The argument ignore_index specifies a class to ignore, i.e. whose error is not counted in the loss; it defaults to -100. For example, the class at padding positions can be set to ignore_index.
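A small sketch of ignore_index, assuming class 0 marks padding positions; the padded samples contribute neither to the sum nor to the averaging:

import torch
import torch.nn as nn

log_probs = torch.log_softmax(torch.randn(4, 3), dim=1)
target = torch.tensor([1, 2, 0, 0])  # the last two positions are padding

loss_fct = nn.NLLLoss(ignore_index=0)
# Averaged over the 2 non-padding samples only.
print(loss_fct(log_probs, target))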
LogSoftmax
PyTorch provides torch.nn.LogSoftmax to normalize and log-transform the network output in one step:

$$\mathrm{LogSoftmax}(x_i) = \log\frac{\exp(x_i)}{\sum_{j}\exp(x_j)}$$
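A one-line check (inputs chosen arbitrarily) that LogSoftmax matches log(softmax(x)), which it computes in a numerically stabler fused form:

import torch

x = torch.randn(2, 3)
m = torch.nn.LogSoftmax(dim=1)
print(torch.allclose(m(x), torch.log(torch.softmax(x, dim=1))))  # True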
CrossEntropyLoss
The cross entropy loss, used for multi-class classification; its input is the unnormalized network output (logits).
For a batch of data $D(x, y)$ containing $N$ samples, $x$ is the unnormalized output of the network, and $y$ holds the class labels, where each sample belongs to one of $C$ classes, $y_n \in [0, C-1]$. The loss of the $n$-th sample is

$$l_n = -w_{y_n}\,\log\frac{\exp(x_{n, y_n})}{\sum_{c=1}^{C}\exp(x_{n, c})} = -w_{y_n}\left(x_{n, y_n} - \log\sum_{c=1}^{C}\exp(x_{n, c})\right)$$

i.e. CrossEntropyLoss is equivalent to LogSoftmax followed by NLLLoss.
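The second form of the equation can be checked directly on raw logits; a minimal sketch with arbitrary values:

import torch
import torch.nn as nn

logits = torch.tensor([[0.8, 0.9, 0.3],
                       [0.2, 0.1, 0.7]])
target = torch.tensor([1, 2])

# l_n = -(x_{n, y_n} - log sum_c exp(x_{n, c}))
manual = -(logits[torch.arange(2), target] - torch.logsumexp(logits, dim=1))
print(manual)
print(nn.CrossEntropyLoss(reduction="none")(logits, target))  # matches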
class CrossEntropyLoss(_WeightedLoss):
    __constants__ = ['ignore_index', 'reduction']
    ignore_index: int

    def __init__(self, weight: Optional[Tensor] = None, size_average=None, ignore_index: int = -100,
                 reduce=None, reduction: str = 'mean') -> None:
        super(CrossEntropyLoss, self).__init__(weight, size_average, reduce, reduction)
        self.ignore_index = ignore_index

    def forward(self, input: Tensor, target: Tensor) -> Tensor:
        assert self.weight is None or isinstance(self.weight, Tensor)
        return F.cross_entropy(input, target, weight=self.weight,
                               ignore_index=self.ignore_index, reduction=self.reduction)
PyTorch implements this as the class torch.nn.CrossEntropyLoss; the function F.cross_entropy can also be called directly. The weight argument in the code is the $w$ above. size_average and reduce are deprecated. reduction takes one of three values, mean, sum, and none, corresponding to different return forms; the default is mean, which corresponds to the usual overall loss, computed as for NLLLoss.
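To make the three reduction modes concrete, a short sketch (no weight, so mean is a plain average over the $N$ samples):

import torch
import torch.nn as nn

logits = torch.randn(4, 3)
target = torch.tensor([0, 1, 2, 1])

per_sample = nn.CrossEntropyLoss(reduction="none")(logits, target)  # shape [4]
print(per_sample.mean(), nn.CrossEntropyLoss(reduction="mean")(logits, target))
print(per_sample.sum(), nn.CrossEntropyLoss(reduction="sum")(logits, target))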
Verification:
import torch
import torch.nn as nn
# multi-class classification
m = torch.nn.LogSoftmax(dim=1)
loss_nll_fct = nn.NLLLoss(reduction="mean")
loss_ce_fct = nn.CrossEntropyLoss(reduction="mean")
input_src = torch.Tensor([[0.8, 0.9, 0.3], [0.8, 0.9, 0.3], [0.8, 0.9, 0.3], [0.8, 0.9, 0.3]])
target = torch.Tensor([1, 1, 0, 0]).long()
# 4 samples, 3 classes
print(input_src.size())
print(target.size())
output = m(input_src)
loss_nll = loss_nll_fct(output, target)
print(loss_nll.item())
# check that the two losses agree
loss_ce = loss_ce_fct(input_src, target)
print(loss_ce.item())
torch.Size([4, 3])
torch.Size([4])
0.9475762844085693
0.9475762844085693