PS: in the false-positive-reduction step the positive and negative samples are hugely imbalanced: roughly 1,500 positives against more than 750,000 negatives, so I want to use Focal Loss to deal with this.
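As a reminder of what all the code below is trying to implement, the focal loss from the original paper (Lin et al., Focal Loss for Dense Object Detection) is

$$\mathrm{FL}(p_t) = -\alpha_t\,(1 - p_t)^{\gamma}\,\log(p_t),\qquad \alpha_t = \begin{cases}\alpha, & y = 1\\ 1 - \alpha, & y = 0\end{cases}$$

where $p_t$ is the predicted probability of the true class. $\gamma$ down-weights easy examples and $\alpha_t$ re-balances positives against negatives; the $\alpha_t$ detail is exactly the point the discussion below keeps coming back to.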
First of all, thanks to Code_Mart's blog, which pulls the theory together (https://blog.csdn.net/Code_Mart/article/details/89736187) and implements and explains both the binary and the multi-class versions of Focal Loss; the discussion between him and xwmwanjy666 also clears up several details.
But the code there did not seem to fit PyTorch 0.4.1, and in their exchange I found https://github.com/ronghuaiyang/arcface-pytorch/blob/master/models/focal_loss.py, which I adapted slightly along the same lines:
import torch
import torch.nn as nn

class FocalLoss(nn.Module):
    def __init__(self, gamma=0, alpha=1):
        super(FocalLoss, self).__init__()
        self.gamma = gamma
        self.ce = nn.CrossEntropyLoss()
        self.alpha = alpha

    def forward(self, input, target):
        logp = self.ce(input, target)        # cross entropy, averaged over the batch
        p = torch.exp(-logp)                 # probability of the true class, from the averaged loss
        loss = (1 - p) ** self.gamma * logp  # focal modulation
        loss = self.alpha * loss
        return loss.mean()
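A quick sanity check of how this version is called (the logits and labels below are made up just for illustration):

# uses the FocalLoss class defined above
logits = torch.tensor([[0.1, 0.9],
                       [2.0, -1.0],
                       [-0.5, 0.5]])   # (N, C) float scores, made up
labels = torch.tensor([1, 0, 1])       # length-N class indices (LongTensor), made up
criterion = FocalLoss(gamma=2, alpha=1)
print(criterion(logits, labels))       # prints a single scalar loss tensor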
This is simple; this is what I wanted. But reading on through their discussion and comparing, I noticed it does not implement the per-class weighting (alpha = a when the ground truth is 1, alpha = 1 - a when the ground truth is 0). That bothered me, so I spent more than four hours searching GitHub: most implementations either ignore this issue or are very complex (hard to follow, with inputs that do not match what I need), are not applied to classification, and usually target PyTorch versions below 0.4.
Until I found https://github.com/louis-she/focal-loss.pytorch/blob/master/focal_loss.py:
import torch
import torch.nn.functional as F

class BCEFocalLoss(torch.nn.Module):
    def __init__(self, gamma=2, alpha=None, reduction='elementwise_mean'):
        super().__init__()
        self.gamma = gamma
        self.alpha = alpha
        self.reduction = reduction

    def forward(self, _input, target):
        pt = torch.sigmoid(_input)
        loss = - (1 - pt) ** self.gamma * target * torch.log(pt) - \
               pt ** self.gamma * (1 - target) * torch.log(1 - pt)
        if self.alpha:
            loss = loss * self.alpha
        if self.reduction == 'elementwise_mean':
            loss = torch.mean(loss)
        elif self.reduction == 'sum':
            loss = torch.sum(loss)
        return loss
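For this sigmoid-based version the model outputs one raw score per sample and the target is a 0/1 float vector of the same shape; a made-up call looks like this:

# uses the BCEFocalLoss class defined above
scores = torch.tensor([2.0, -1.5, 0.3])   # one raw logit per sample, made up
labels = torch.tensor([1.0, 0.0, 1.0])    # binary ground truth as floats, made up
criterion = BCEFocalLoss(gamma=2, alpha=0.25)
print(criterion(scores, labels))          # scalar, since reduction defaults to 'elementwise_mean'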
The line loss = - (1 - pt) ** self.gamma * target * torch.log(pt) - pt ** self.gamma * (1 - target) * torch.log(1 - pt) also solves the problem above. I do not quite understand why the author did not adjust alpha accordingly (alpha = a when the ground truth is 1, alpha = 1 - a when it is 0), but borrowing this idea I changed the first piece of code into the following:
import torch
import torch.nn as nn

class FocalLoss(nn.Module):
    def __init__(self, gamma=2, alpha=0.25):
        super(FocalLoss, self).__init__()
        self.gamma = gamma
        self.ce = nn.CrossEntropyLoss()
        self.alpha = alpha

    def forward(self, input, target):
        logp = self.ce(input, target)
        p = torch.exp(-logp)
        loss = self.alpha * (1 - p) ** self.gamma * logp * target.long() + \
               (1 - self.alpha) * p ** self.gamma * logp * (1 - target.long())
        return loss.mean()
I am not sure whether writing it this way is correct.
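One possible reason for the doubt (my own guess, not something from the posts referenced above): nn.CrossEntropyLoss with its default reduction already returns a single scalar averaged over the batch, so logp and p above are not per-sample values, and multiplying them by target no longer weights individual samples. A per-sample variant would look roughly like the sketch below (assuming PyTorch 0.4.1+, where reduction='none' is available; the class name FocalLossPerSample is just my own):

import torch
import torch.nn as nn

class FocalLossPerSample(nn.Module):
    # hypothetical sketch, not code from the posts above
    def __init__(self, gamma=2, alpha=0.25):
        super(FocalLossPerSample, self).__init__()
        self.gamma = gamma
        self.alpha = alpha
        self.ce = nn.CrossEntropyLoss(reduction='none')   # keep one loss value per sample

    def forward(self, input, target):
        logp = self.ce(input, target)      # shape (M,): per-sample cross entropy
        p = torch.exp(-logp)               # per-sample probability of the true class
        t = target.float()
        alpha_t = self.alpha * t + (1 - self.alpha) * (1 - t)   # alpha for positives, 1-alpha for negatives
        loss = alpha_t * (1 - p) ** self.gamma * logp
        return loss.mean()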
PS, added 2019-06-14:
I tested the code modified with this idea (the last code above) using the test code from 'Focal Loss 分类问题 pytorch实现代码续3'.
The intermediate outputs and the resulting loss values are:
input=torch.Tensor([[ 0.0543, 0.5641],[ 1.2221, -0.5496],[-0.7951, -0.1546],[-0.4557, 1.4724]])
target= torch.Tensor([1,0,1,1])
tensor([[0.3752, 0.6248],
[0.8547, 0.1453],
[0.3451, 0.6549],
[0.1270, 0.8730]])
tensor(0.0080)
tensor(0.0344)
tensor(0.2966)
target= torch.Tensor([0,1,0,0])
tensor([[0.3752, 0.6248],
[0.8547, 0.1453],
[0.3451, 0.6549],
[0.1270, 0.8730]])
tensor(0.5403)
tensor(0.0987)
tensor(1.5092)
From these results the effect is not good. The first target matches the predicted probabilities, so its loss should be small; the second target is the opposite of the predictions, so its loss should be large. The trend is right, but the ratios between the two cases are only roughly 70x, 3x and 5x respectively; in that case I might as well just use the built-in loss function. So in the end I went with the conclusion code from 'Focal Loss 分类问题 pytorch实现代码续3':
import torch
import torch.nn as nn

# binary classification
class FocalLoss(nn.Module):
    def __init__(self, gamma=2, alpha=0.25):
        super(FocalLoss, self).__init__()
        self.gamma = gamma
        self.alpha = alpha

    def forward(self, input, target):
        # input: size is M*2, where M is the batch size
        # target: size is M
        pt = torch.softmax(input, dim=1)
        p = pt[:, 1]                      # predicted probability of the positive class
        loss = -self.alpha * (1 - p) ** self.gamma * (target * torch.log(p)) - \
               (1 - self.alpha) * p ** self.gamma * ((1 - target) * torch.log(1 - p))
        return loss.mean()
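Calling it with the same test tensors used earlier shows the expected shapes: input is an M*2 tensor of logits and target is a length-M 0/1 float vector.

criterion = FocalLoss(gamma=2, alpha=0.25)
input = torch.Tensor([[0.0543, 0.5641], [1.2221, -0.5496], [-0.7951, -0.1546], [-0.4557, 1.4724]])
target = torch.Tensor([1, 0, 1, 1])   # labels agree with the predictions, so the loss should come out small
print(criterion(input, target))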