A Summary of Loss Functions


The hashing loss function in DSH

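$$L_r(b_1, b_2, y) = \sum_{i=1}^{N} \left\{ \frac{1}{2}(1 - y_i)\,\|b_{i,1} - b_{i,2}\|_2^2 + \frac{1}{2}\, y_i \max\left(m - \|b_{i,1} - b_{i,2}\|_2^2,\ 0\right) + \alpha \left( \big\|\,|b_{i,1}| - 1\,\big\|_1 + \big\|\,|b_{i,2}| - 1\,\big\|_1 \right) \right\}$$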

Specifically, for a pair of images $I_1, I_2 \in \Omega$ and the corresponding binary network outputs $b_1, b_2$, we define $y = 0$ if they are similar, and $y = 1$ otherwise.
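The per-pair term of this loss can be sketched in plain Python. The sketch reads `m` as a margin threshold and `alpha` as the weight of the regularizer that pulls each output component toward +1 or -1. The function name `dsh_pair_loss` and the use of Python lists are illustrative assumptions, not code from the DSH paper.

```python
def dsh_pair_loss(b1, b2, y, m, alpha):
    """Loss for one pair of real-valued network outputs b1, b2.

    y = 0 marks a similar pair, y = 1 a dissimilar pair.
    """
    # Squared Euclidean distance between the two outputs
    sq_dist = sum((u - v) ** 2 for u, v in zip(b1, b2))
    # Similar pairs are penalized for being far apart
    similar_term = 0.5 * (1 - y) * sq_dist
    # Dissimilar pairs are penalized only while closer than the margin m
    dissimilar_term = 0.5 * y * max(m - sq_dist, 0.0)
    # L1 regularizer: || |b| - 1 ||_1 for each output
    reg = sum(abs(abs(u) - 1.0) for u in b1) + sum(abs(abs(v) - 1.0) for v in b2)
    return similar_term + dissimilar_term + alpha * reg


# Hypothetical 2-bit codes for one dissimilar pair
print(dsh_pair_loss([1.0, -1.0], [1.0, 1.0], y=1, m=6.0, alpha=0.01))
```

Summing this value over all N pairs gives the full loss.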


L1Loss

Creates a criterion that measures the mean absolute value of the element-wise difference between input x and target y:
$$\text{loss}(x, y) = \frac{1}{n} \sum_i |x_i - y_i|$$
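As a quick check of the definition, the mean absolute difference can be computed by hand in plain Python. The name `l1_loss` is an illustrative assumption and this is only a sketch of the formula, not the library implementation.

```python
def l1_loss(x, y):
    """Mean of |x_i - y_i| over all n elements."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Differences are 0, 2 and 2, so the mean is 4/3
print(l1_loss([1.0, 2.0, 3.0], [1.0, 0.0, 5.0]))
```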


MSELoss

$$\text{loss}(x, y) = \frac{1}{n} \sum_i |x_i - y_i|^2$$
Measures the mean squared error between the n elements of the input x and the target y.
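The mean squared error can be written out the same way. The name `mse_loss` is an illustrative assumption, and the sketch only mirrors the formula.

```python
def mse_loss(x, y):
    """Mean of (x_i - y_i)^2 over all n elements."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


# Squared differences are 0, 4 and 4, so the mean is 8/3
print(mse_loss([1.0, 2.0, 3.0], [1.0, 0.0, 5.0]))
```

Because each difference is squared, a single large error contributes far more here than it does to a mean absolute error.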


CrossEntropyLoss

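The loss can be described as:
$$\text{loss}(x, class) = -\log\left(\frac{e^{x_{class}}}{\sum_j e^{x_j}}\right) = -x_{class} + \log\left(\sum_j e^{x_j}\right)$$
or, when the weight argument is specified:
$$\text{loss}(x, class) = w_{class}\left(-x_{class} + \log\left(\sum_j e^{x_j}\right)\right)$$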
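The following sketch evaluates the loss for one sample as minus the score of the true class plus the log of the summed exponentials of all scores, optionally scaled by the weight of the true class. Subtracting the maximum score before exponentiating is a common stability trick added here as an assumption. The name `cross_entropy` is illustrative.

```python
import math


def cross_entropy(x, cls, weight=None):
    """Cross entropy of raw scores x (a list) against class index cls."""
    # log(sum_j exp(x_j)), computed with the max subtracted to avoid overflow
    mx = max(x)
    log_sum_exp = mx + math.log(sum(math.exp(v - mx) for v in x))
    loss = -x[cls] + log_sum_exp
    if weight is not None:
        loss *= weight[cls]  # per-class rescaling
    return loss


# Two equal scores give a loss of log(2)
print(cross_entropy([0.0, 0.0], 0))
```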


NLLLoss

The loss can be described as:
$$\text{loss}(x, class) = -x_{class}$$
or, when the weight argument is specified:
$$\text{loss}(x, class) = -w_{class}\, x_{class}$$
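The negative log-likelihood simply picks out minus the entry of the true class, so the input is expected to hold log-probabilities already. The sketch below assumes they come from a log-softmax, in which case the result matches the cross entropy of the raw scores. Both function names are illustrative.

```python
import math


def log_softmax(x):
    """Log-probabilities from raw scores (max subtracted for stability)."""
    mx = max(x)
    log_sum_exp = mx + math.log(sum(math.exp(v - mx) for v in x))
    return [v - log_sum_exp for v in x]


def nll_loss(log_probs, cls, weight=None):
    """Minus the (optionally weighted) log-probability of class cls."""
    w = 1.0 if weight is None else weight[cls]
    return -w * log_probs[cls]


scores = [1.0, 2.0, 3.0]
print(nll_loss(log_softmax(scores), 2))
```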


KLDivLoss

the loss can be described as:
$$\text{loss}(x, target) = \frac{1}{n} \sum_i target_i \left(\log(target_i) - x_i\right)$$
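In this formula the input x holds log-probabilities while the target holds probabilities. A plain-Python sketch follows. The name `kl_div_loss` and the convention that a target entry of 0 contributes nothing are assumptions made for the example.

```python
import math


def kl_div_loss(x, target):
    """(1/n) * sum_i target_i * (log(target_i) - x_i), with x as log-probabilities."""
    total = 0.0
    for xi, ti in zip(x, target):
        if ti > 0:  # assumed convention: 0 * log(0) counts as 0
            total += ti * (math.log(ti) - xi)
    return total / len(x)


log_q = [math.log(0.25), math.log(0.75)]
print(kl_div_loss(log_q, [0.5, 0.5]))
```

When the input equals the log of the target, every term vanishes and the loss is 0.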


BCELoss

Creates a criterion that measures the binary cross entropy between the target and the output:
$$\text{loss}(o, t) = -\frac{1}{n} \sum_i \left(t_i \log(o_i) + (1 - t_i) \log(1 - o_i)\right)$$
or, when the weight argument is specified:
$$\text{loss}(o, t) = -\frac{1}{n} \sum_i w_i \left(t_i \log(o_i) + (1 - t_i) \log(1 - o_i)\right)$$
This is used for measuring the error of a reconstruction, for example in an auto-encoder. Note that the targets $t_i$ should be numbers between 0 and 1.
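A direct transcription of the binary cross entropy is sketched below. It assumes every output lies strictly between 0 and 1, since the logarithm is undefined at the endpoints. The name `bce_loss` is illustrative.

```python
import math


def bce_loss(o, t, weight=None):
    """-(1/n) * sum_i w_i * (t_i*log(o_i) + (1 - t_i)*log(1 - o_i))."""
    n = len(o)
    total = 0.0
    for i in range(n):
        w = 1.0 if weight is None else weight[i]
        total += w * (t[i] * math.log(o[i]) + (1 - t[i]) * math.log(1 - o[i]))
    return -total / n


# An output of 0.5 costs log(2) whatever the target is
print(bce_loss([0.5, 0.5], [1.0, 0.0]))
```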


BCEWithLogitsLoss

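The binary cross entropy between the target and the output logits (no sigmoid applied) is:
$$\text{loss}(o, t) = -\frac{1}{n} \sum_i \left(t_i \log(\text{sigmoid}(o_i)) + (1 - t_i) \log(1 - \text{sigmoid}(o_i))\right)$$
or, when the weight argument is specified:
$$\text{loss}(o, t) = -\frac{1}{n} \sum_i w_i \left(t_i \log(\text{sigmoid}(o_i)) + (1 - t_i) \log(1 - \text{sigmoid}(o_i))\right)$$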
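Applying a sigmoid and then a logarithm can overflow or lose precision for large logits. The sketch below uses the algebraically equivalent per-element form max(o, 0) - o*t + log(1 + exp(-|o|)), which follows from log(sigmoid(o)) = o - log(1 + exp(o)). The function name is illustrative, and this is one common way to stabilize the computation, not necessarily what any particular library does internally.

```python
import math


def bce_with_logits(o, t, weight=None):
    """Binary cross entropy on raw logits o, averaged over n elements."""
    n = len(o)
    total = 0.0
    for i in range(n):
        w = 1.0 if weight is None else weight[i]
        # Stable rewrite of -(t*log(sigmoid(o)) + (1 - t)*log(1 - sigmoid(o)))
        total += w * (max(o[i], 0.0) - o[i] * t[i] + math.log1p(math.exp(-abs(o[i]))))
    return total / n


# A very large logit is handled without overflow
print(bce_with_logits([1000.0, 0.0], [1.0, 0.0]))
```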


MarginRankingLoss

If y == 1, it is assumed that the first input should be ranked higher (have a larger value) than the second input, and vice versa for y == -1.
The loss function for each sample in the mini-batch is:
$$\text{loss}(x, y) = \max(0,\ -y\,(x_1 - x_2) + \text{margin})$$
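For a single sample the ranking loss is max(0, -y * (x1 - x2) + margin), which is zero once the pair is ordered as y requests by at least the margin. A small sketch follows, with `margin_ranking_loss` as an illustrative name.

```python
def margin_ranking_loss(x1, x2, y, margin=0.0):
    """y = 1 asks for x1 > x2, y = -1 asks for x2 > x1."""
    return max(0.0, -y * (x1 - x2) + margin)


print(margin_ranking_loss(2.0, 1.0, 1))   # correctly ordered: no loss
print(margin_ranking_loss(2.0, 1.0, -1))  # wrongly ordered: positive loss
```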


HingeEmbeddingLoss

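$$\text{loss}(x, y) = \frac{1}{n} \sum_i \begin{cases} x_i, & \text{if } y_i = 1 \\ \max(0,\ \text{margin} - x_i), & \text{if } y_i = -1 \end{cases}$$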
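Each element contributes x_i when its label is 1 and max(0, margin - x_i) when its label is -1, and the contributions are averaged over the n elements. The sketch below assumes a default margin of 1 and uses the illustrative name `hinge_embedding_loss`.

```python
def hinge_embedding_loss(x, y, margin=1.0):
    """Average hinge embedding loss; labels in y are 1 or -1."""
    total = 0.0
    for xi, yi in zip(x, y):
        if yi == 1:
            total += xi  # similar pair: the value itself is the penalty
        else:
            total += max(0.0, margin - xi)  # dissimilar pair: penalized inside the margin
    return total / len(x)


print(hinge_embedding_loss([0.2, 0.3, 2.0], [1, -1, -1]))
```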


MultiLabelMarginLoss

$$\text{loss}(x, y) = \frac{\sum_{i,j} \max\left(0,\ 1 - (x_{y_j} - x_i)\right)}{\texttt{x.size(0)}}$$
where $i$ runs from 0 to x.size(0) - 1, $j$ runs from 0 to y.size(0) - 1, $y_j \neq 0$, and $i \neq y_j$ for all $i$ and $j$.
x and y must have the same size.
The criterion only considers the first non-zero $y_j$ targets. This allows different samples to have a variable number of target classes.


SmoothL1Loss

Creates a criterion that uses a squared term if the absolute element-wise error falls below 1 and an L1 term otherwise. It is less sensitive to outliers than the MSELoss and in some cases prevents exploding gradients (e.g. see “Fast R-CNN” paper by Ross Girshick). Also known as the Huber loss:
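$$\text{loss}(x, y) = \frac{1}{n} \sum_i \begin{cases} 0.5\,(x_i - y_i)^2, & \text{if } |x_i - y_i| < 1 \\ |x_i - y_i| - 0.5, & \text{otherwise} \end{cases}$$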
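The piecewise rule (quadratic for absolute errors below 1, linear minus 0.5 otherwise, averaged over all elements) can be sketched as follows. The name `smooth_l1_loss` is illustrative.

```python
def smooth_l1_loss(x, y):
    """Quadratic for small errors, linear for large ones, averaged over n."""
    total = 0.0
    for a, b in zip(x, y):
        diff = abs(a - b)
        if diff < 1:
            total += 0.5 * diff ** 2
        else:
            total += diff - 0.5
    return total / len(x)


# Errors of 0.5 and 3.0 contribute 0.125 and 2.5
print(smooth_l1_loss([0.5, 3.0], [0.0, 0.0]))
```

The two branches meet at an absolute error of 1, where both give 0.5, so the loss is continuous.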


SoftMarginLoss

$$\text{loss}(x, y) = \frac{\sum_i \log\left(1 + e^{-y_i x_i}\right)}{\texttt{x.nelement()}}$$
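This is the logistic loss log(1 + exp(-y_i * x_i)) averaged over all elements, with labels of 1 or -1. A minimal sketch, assuming moderate input magnitudes so that the exponential does not overflow, with `soft_margin_loss` as an illustrative name:

```python
import math


def soft_margin_loss(x, y):
    """Mean of log(1 + exp(-y_i * x_i)); labels in y are 1 or -1."""
    total = sum(math.log(1.0 + math.exp(-yi * xi)) for xi, yi in zip(x, y))
    return total / len(x)


# A score of 0 costs log(2) for either label
print(soft_margin_loss([0.0, 0.0], [1, -1]))
```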


MultiLabelSoftMarginLoss

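$$\text{loss}(x, y) = -\sum_i \left( y_i \log\frac{1}{1 + e^{-x_i}} + (1 - y_i) \log\frac{e^{-x_i}}{1 + e^{-x_i}} \right)$$
where $i$ runs from 0 to x.nelement() - 1 and $y_i \in \{0, 1\}$.
x and y must have the same size.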


CosineEmbeddingLoss

$$\text{loss}(x, y) = \begin{cases} 1 - \cos(x_1, x_2), & \text{if } y = 1 \\ \max(0,\ \cos(x_1, x_2) - \text{margin}), & \text{if } y = -1 \end{cases}$$
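For one pair of vectors the loss is 1 - cos(x1, x2) when y is 1 and max(0, cos(x1, x2) - margin) when y is -1. The sketch below assumes non-zero vectors and uses the illustrative name `cosine_embedding_loss`.

```python
import math


def cosine_embedding_loss(x1, x2, y, margin=0.0):
    """y = 1 pulls the vectors together, y = -1 pushes them apart."""
    dot = sum(a * b for a, b in zip(x1, x2))
    norm1 = math.sqrt(sum(a * a for a in x1))
    norm2 = math.sqrt(sum(b * b for b in x2))
    cos = dot / (norm1 * norm2)  # assumes neither vector is all zeros
    if y == 1:
        return 1.0 - cos
    return max(0.0, cos - margin)


print(cosine_embedding_loss([1.0, 0.0], [0.0, 1.0], 1))   # orthogonal, similar label
print(cosine_embedding_loss([1.0, 0.0], [0.0, 1.0], -1))  # orthogonal, dissimilar label
```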


MultiMarginLoss

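$$\text{loss}(x, y) = \frac{\sum_i \max\left(0,\ \text{margin} - x_y + x_i\right)^p}{\texttt{x.size(0)}}$$
where $i$ runs from 0 to x.size(0) - 1 and $i \neq y$.
Optionally, you can weight the classes unequally by passing a 1D weight tensor into the constructor.
The loss function then becomes:
$$\text{loss}(x, y) = \frac{\sum_i \max\left(0,\ w_y\,(\text{margin} - x_y + x_i)\right)^p}{\texttt{x.size(0)}}$$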
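For one sample with true class y, every other class i contributes max(0, w_y * (margin - x_y + x_i)) raised to the power p, and the sum is divided by the number of classes. A sketch under those definitions, with `multi_margin_loss` as an illustrative name and defaults of p = 1 and margin = 1 chosen for the example:

```python
def multi_margin_loss(x, y, p=1, margin=1.0, weight=None):
    """Multi-class hinge loss for scores x (a list) and true class index y."""
    w = 1.0 if weight is None else weight[y]
    total = 0.0
    for i in range(len(x)):
        if i == y:
            continue  # the true class is skipped
        total += max(0.0, w * (margin - x[y] + x[i])) ** p
    return total / len(x)


print(multi_margin_loss([0.1, 0.2, 0.4, 0.8], 3))
```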


TripletMarginLoss

The distance swap is described in detail in the paper "Learning shallow convolutional feature descriptors with triplet losses" by V. Balntas, E. Riba et al.
$$L(a, p, n) = \frac{1}{N} \left( \sum_{i=1}^{N} \max\left\{ d(a_i, p_i) - d(a_i, n_i) + \text{margin},\ 0 \right\} \right)$$
where $d(x_i, y_i) = \|x_i - y_i\|_2^2$.
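The triplet loss averages max(d(a, p) - d(a, n) + margin, 0) over N triplets of anchor, positive and negative. The sketch below follows the distance exactly as written in this section, the squared Euclidean distance, and leaves out the distance swap. The function name and the default margin of 1 are illustrative assumptions.

```python
def triplet_margin_loss(anchors, positives, negatives, margin=1.0):
    """Average triplet loss over N (anchor, positive, negative) triplets."""

    def d(u, v):
        # Squared Euclidean distance, as in the formula above
        return sum((a - b) ** 2 for a, b in zip(u, v))

    total = 0.0
    for a, p, n in zip(anchors, positives, negatives):
        total += max(d(a, p) - d(a, n) + margin, 0.0)
    return total / len(anchors)


anchors = [[0.0, 0.0], [0.0, 0.0]]
positives = [[1.0, 0.0], [1.0, 0.0]]
negatives = [[2.0, 0.0], [1.0, 0.0]]
print(triplet_margin_loss(anchors, positives, negatives))
```

The first triplet already separates the negative by more than the margin and contributes nothing. The second has equal distances and contributes exactly the margin.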
