On "Deep Variational Metric Learning"

Paper reading notes

  1. Paper title: Deep Variational Metric Learning
  2. Motivation: most existing methods treat intra-class variance indiscriminately, which makes the model overfit the training set.
  3. Contributions: a deep variational metric learning (DVML) framework that
    - models the intra-class variance and disentangles the intra-class invariance, and
    - generates discriminative samples to improve robustness.
  4. The DVML network architecture:
    [Figure 1: the DVML network architecture]
    Loss function:
    $$L = \lambda_1 L_1 + \lambda_2 L_2 + \lambda_3 L_3 + \lambda_4 L_4$$
    $$\begin{aligned} \mathcal{L}\left( \theta, \phi; \mathbf{X}_b \right) &\approx \frac{1}{2B} \sum_{i=1}^{B} \sum_{j=1}^{J} \left( 1 + \log\left( \big( \sigma_j^{(i)} \big)^2 \right) - \big( \mu_j^{(i)} \big)^2 - \big( \sigma_j^{(i)} \big)^2 \right) \\ &\quad + \frac{1}{TB} \sum_{i=1}^{B} \sum_{t=1}^{T} \log p_\theta\left( \mathbf{x}^{(i)} \mid \mathbf{z}^{(i,t)} \right) \\ &\triangleq L_1 + L_2 \end{aligned}$$
    $$L_1 = \frac{1}{2B} \sum_{i=1}^{B} \sum_{j=1}^{J} \left( 1 + \log\left( \big( \sigma_j^{(i)} \big)^2 \right) - \big( \mu_j^{(i)} \big)^2 - \big( \sigma_j^{(i)} \big)^2 \right)$$
    $$L_2 = \frac{1}{TB} \sum_{i=1}^{B} \sum_{t=1}^{T} \left\| \mathbf{x}^{(i)} - \hat{\mathbf{x}}^{(i,t)} \right\|_2$$
    $\mathcal{L}\left( \theta, \phi; \mathbf{X}_b \right)$ approximates the loss of the intra-class variance model: $L_1$ forces the distribution of the intra-class variance toward an isotropic centered Gaussian, while $L_2$ makes the intra-class variance preserve sample-specific information.
    $L_3 = L_{\mathrm{m}}(\hat{\mathbf{z}})$, where $\hat{\mathbf{z}} = \mathbf{z}_{I_k} + \hat{\mathbf{z}}_V$ is a synthesized sample; adding $\mathbf{z}_{I_k}$ is meant to emphasize that different classes have different intra-class variance.
    $L_4 = L_{\mathrm{m}}\left( \mathbf{z}_I \right)$ constrains the intra-class invariance. (A sketch of assembling the four terms follows this list.)
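Below is a minimal PyTorch-style sketch of how the four terms could be combined. It is written under my own assumptions: `decoder` and `metric_loss` are hypothetical callables (a real metric loss would also need labels, omitted here for brevity), and the synthesized samples are built in a simplified way rather than exactly as in the paper.

```python
import torch

def dvml_loss(x, z_i, mu, logvar, decoder, metric_loss,
              lambdas=(1.0, 1.0, 1.0, 1.0), T=5):
    """Hypothetical helper combining the four DVML terms: L = sum_k lambda_k * L_k."""
    # L1: KL(q(z_V | x) || N(0, I)). The paper writes this as an ELBO term to
    # be maximized, so it is negated here to act as a loss to minimize.
    l1 = -0.5 * torch.mean(
        torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=1))

    # L2: reconstruction error averaged over T samples of z_V, drawn with the
    # reparameterization trick: z_V = mu + eps * sigma, eps ~ N(0, I).
    l2 = x.new_zeros(())
    for _ in range(T):
        z_v = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        x_hat = decoder(z_i + z_v)               # reconstruct from z_I + z_V
        l2 = l2 + torch.norm(x - x_hat, dim=1).mean()
    l2 = l2 / T

    # L3: metric loss on synthesized samples z_hat = z_{I_k} + sampled variance
    # (simplified: the paper combines invariances and sampled variances per class).
    z_hat = z_i + mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
    l3 = metric_loss(z_hat)

    # L4: metric loss directly on the disentangled intra-class invariance.
    l4 = metric_loss(z_i)

    return sum(w * term for w, term in zip(lambdas, (l1, l2, l3, l4)))
```

Here `metric_loss` stands in for any of the baseline $L_{\mathrm{m}}$ expressions listed next.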
Expressions of the baseline metric losses $L_{\mathrm{m}}$ (code sketches follow the formulas):
Triplet: $$L_{\mathrm{m}} = \sum_{i=1}^{N} \max\left( \alpha + D\left( \mathbf{z}_{(a)}^{(i)}, \mathbf{z}_{(p)}^{(i)} \right)^2 - D\left( \mathbf{z}_{(a)}^{(i)}, \mathbf{z}_{(n)}^{(i)} \right)^2, 0 \right)$$
N-pair: $$L_{\mathrm{m}} = \frac{1}{N} \sum_{i=1}^{N} \log\left( 1 + \sum_{j \neq i} \exp\left( \mathbf{z}^{(i)\top} \mathbf{z}_{+}^{(j)} - \mathbf{z}^{(i)\top} \mathbf{z}_{+}^{(i)} \right) \right)$$
Triplet with distance-weighted sampling (non-squared distances): $$L_{\mathrm{m}} = \sum_{i=1}^{N} \max\left( \alpha + D\left( \mathbf{z}_{(a)}^{(i)}, \mathbf{z}_{(p)}^{(i)} \right) - D\left( \mathbf{z}_{(a)}^{(i)}, \mathbf{z}_{(n)}^{(i)} \right), 0 \right)$$
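For completeness, minimal sketches of the triplet and N-pair baselines above, assuming pre-mined anchor/positive/negative batches (the mining and distance-weighted sampling strategies themselves are out of scope here):

```python
import torch

def triplet_loss(z_a, z_p, z_n, alpha=1.0, squared=True):
    """Triplet L_m; squared=True matches the first variant above,
    squared=False the distance-weighted-sampling variant."""
    d_ap = torch.norm(z_a - z_p, dim=1)   # D(z_a, z_p) per triplet
    d_an = torch.norm(z_a - z_n, dim=1)   # D(z_a, z_n) per triplet
    if squared:
        d_ap, d_an = d_ap.pow(2), d_an.pow(2)
    return torch.clamp(alpha + d_ap - d_an, min=0).sum()

def n_pair_loss(z, z_pos):
    """N-pair L_m: row i of `z` is an anchor, row i of `z_pos` its
    positive; every z_pos[j] with j != i serves as a negative."""
    logits = z @ z_pos.t()                     # entry (i, j) = z^(i)T z_+^(j)
    off = logits - logits.diag().unsqueeze(1)  # subtract z^(i)T z_+^(i) per row
    mask = ~torch.eye(len(z), dtype=torch.bool, device=z.device)
    # log(1 + sum_{j != i} exp(...)) per anchor, averaged over the batch
    return torch.log1p((off.exp() * mask).sum(dim=1)).mean()
```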
