PyTorch Cross-Entropy Loss Functions Explained

1. Binary cross-entropy loss: BCELoss

1.1 Mathematical background
$BCELoss = -\left[\, Y_{n} \cdot \log X_{n} + (1 - Y_{n}) \cdot \log(1 - X_{n}) \,\right]$

Note: $Y_{n}$ is the class label, usually given as an index. $X_{n}$ is a predicted probability and must not be exactly 1; if it goes out of range, the log term blows up and the result becomes inf.
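To see why $X_{n}$ must stay strictly inside (0, 1), here is a minimal sketch (the helper `bce` below is just an illustrative name, not PyTorch API) that evaluates the formula from 1.1 by hand; at $X_{n} = 1$ the $\log(1 - X_{n})$ term diverges and the result is inf:

import numpy as np

def bce(xn, yn):
    # manual binary cross-entropy for one sample, straight from the formula in 1.1
    return -(yn * np.log(xn) + (1 - yn) * np.log(1 - xn))

print(bce(0.8, 1.0))  # 0.2231... : -log(0.8), a valid probability/label pair
print(bce(1.0, 0.0))  # inf       : log(1 - 1) = log(0) diverges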

1.2 Code test

import numpy as np
import torch
import torch.nn as nn

xn = torch.FloatTensor([0.8])   # predicted probability
yn = torch.FloatTensor([2])     # label used in this experiment (outside the usual {0, 1} range)
bceloss = nn.BCELoss()
out1 = bceloss(xn, yn)
print(out1)

print('-' * 10)
# recompute the same value by hand with the formula from 1.1
Xn = 0.8
Yn = 2
out2 = -(Yn * np.log(Xn) + (1 - Yn) * np.log(1 - Xn))
print(out2)

Output:
tensor(-1.1632)
----------
-1.163150809805681
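The negative value above appears only because the label 2 lies outside the usual {0, 1} range for BCELoss; with a valid binary label the loss is non-negative. A quick sanity check along the same lines (a sketch, not from the original test):

import torch
import torch.nn as nn

bceloss = nn.BCELoss()
xn = torch.FloatTensor([0.8])
yn = torch.FloatTensor([1.0])  # valid binary label
print(bceloss(xn, yn))         # tensor(0.2231), i.e. -log(0.8)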

2. Multi-class cross-entropy loss: CrossEntropyLoss

2.1 Mathematical background
$loss(x, class) = -\log\dfrac{\exp(x[class])}{\sum_{j}\exp(x[j])} = -x[class] + \log\Bigl(\sum_{j}\exp(x[j])\Bigr)$

Note: $x$ is the input (raw scores/logits), usually a multi-dimensional matrix. $class$ is the class label, usually an integer index.

2.2 Code test

# -*- coding: utf-8 -*-
import numpy as np
import torch
import torch.nn as nn

# reduction='mean' is the modern equivalent of the deprecated size_average=True;
# Variable is no longer needed, tensors track gradients directly
loss = nn.CrossEntropyLoss(reduction='mean')
xn = torch.tensor([[1.0, 2.0]], requires_grad=True)  # raw scores (logits), shape (batch, classes)
yn = torch.tensor([1])                               # target class index

output = loss(xn, yn)
output.backward()
print(output)

print('-' * 10)
# recompute the same value by hand with the formula from 2.1
Xn = np.array([[1, 2]])
Yn = np.array([1])
out2 = -Xn[:, Yn] + np.log(np.exp(Xn[:, 0]) + np.exp(Xn[:, 1]))
print(out2)

Output:
tensor(0.3133, grad_fn=<NllLossBackward>)
----------
[[0.31326169]]
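As the formula in 2.1 suggests, CrossEntropyLoss is a log-softmax followed by a negative log-likelihood, so nn.CrossEntropyLoss should give the same number as nn.LogSoftmax plus nn.NLLLoss applied in sequence. A minimal sketch of that equivalence on the same input:

import torch
import torch.nn as nn

xn = torch.FloatTensor([[1, 2]])
yn = torch.LongTensor([1])

ce = nn.CrossEntropyLoss()(xn, yn)
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(xn), yn)
print(ce, nll)  # both tensor(0.3133)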
