Levenberg-Marquardt算法求解单应性矩阵

A. Levenberg-Marquardt算法

待估计的模型参数 x = [ x 1 , x 2 , ⋯   , x n ] T \mathbf{x}=[x_1, x_2, \cdots,x_n]^T x=[x1,x2,,xn]T

误差函数 f i ( x ) f_i( \mathbf{x}) fi(x): 表示第 i i i个观测值与对应的模型的估计值的差

m m m:观测值的个数

目标函数 F ( x ) = 1 2 ∑ i = 1 m ( f i ( x ) ) 2 F(\mathbf{x}) = \frac{1}{2} \displaystyle \sum^{m}_{i=1}(f_i(\mathbf{x}))^2 F(x)=21i=1m(fi(x))2

最小二乘问题可以描述为
x ∗ = a r g m i n x { F ( x ) } \mathbf{x}^* = \underset{\mathbf{x}}{argmin}\{ F(\mathbf{x}) \} x=xargmin{F(x)}
一些用到的变量和公式
f ( x ) = [ f 1 ( x ) , ⋯   , f m ( x ) ] T F ( x ) = 1 2 f T ( x ) f ( x ) = 1 2 ∣ ∣ f ( x ) ∣ ∣ 2 f ′ ( x ) = J ( x ) = [ ∂ f 1 ( x ) ∂ x 1 ⋯ ∂ f 1 ( x ) ∂ x n ⋮ ⋱ ⋮ ∂ f m ( x ) ∂ x 1 ⋯ ∂ f m ( x ) ∂ x n ] F ′ ( x ) = f ′ ( x ) f ( x ) = J T ( x ) f ( x ) \mathbf{f}(\mathbf{x})=[f_1(\mathbf{x}), \cdots,f_m(\mathbf{x})]^T \\ F(\mathbf{x}) = \displaystyle \frac{1}{2} \mathbf{f}^T(\mathbf{x}) \mathbf{f}(\mathbf{x}) = \frac{1}{2} ||\mathbf{f}(\mathbf{x})||^2 \\ \mathbf{f}^{'}(\mathbf{x}) = \mathbf{J}(\mathbf{x}) = \displaystyle \begin{bmatrix} \frac{\partial f_1(\mathbf{x})}{\partial x_1} &\cdots &\frac{\partial f_1(\mathbf{x})}{\partial x_n} \\ \vdots &\ddots &\vdots \\ \frac{\partial f_m(\mathbf{x})}{\partial x_1} &\cdots &\frac{\partial f_m(\mathbf{x})}{\partial x_n} \\ \end{bmatrix} \\ F^{'}(\mathbf{x}) = \mathbf{f}^{'}(\mathbf{x})\mathbf{f}(\mathbf{x}) = \mathbf{J}^T(\mathbf{x})\mathbf{f}(\mathbf{x}) \\ f(x)=[f1(x),,fm(x)]TF(x)=21fT(x)f(x)=21∣∣f(x)2f(x)=J(x)= x1f1(x)x1fm(x)xnf1(x)xnfm(x) F(x)=f(x)f(x)=JT(x)f(x)

f ( x ) \mathbf{f}(\mathbf{x}) f(x) x \mathbf{x} x处一阶泰勒展开
f ( x + h ) ≈ f ( x ) + J ( x ) h where  h  sufficiently small \displaystyle \mathbf{f}(\mathbf{x} + \mathbf{h}) \approx \mathbf{f}(\mathbf{x}) + \mathbf{J}(\mathbf{x}) \mathbf{h}\\ \text{where} \ \mathbf{h}\ \text{sufficiently small} f(x+h)f(x)+J(x)hwhere h sufficiently small
于是有
F ( x + h ) = 1 2 f T ( x + h ) f ( x + h ) ≈ 1 2 ( f ( x ) + J ( x ) h ) T ( f ( x ) + J ( x ) h ) = F ( x ) + h T J T ( x ) f ( x ) + 1 2 h T J T ( x ) J ( x ) h where  h  sufficiently small \displaystyle \begin{align} F(\mathbf{x} + \mathbf{h}) &= \frac{1}{2} \mathbf{f}^T(\mathbf{x} + \mathbf{h}) \mathbf{f}(\mathbf{x}+ \mathbf{h}) \\ &\approx \frac{1}{2} \left( \mathbf{f}(\mathbf{x}) + \mathbf{J}(\mathbf{x}) \mathbf{h} \right)^T \left( \mathbf{f}(\mathbf{x}) + \mathbf{J}(\mathbf{x}) \mathbf{h} \right) \\ &= F(\mathbf{x}) + \mathbf{h}^T \mathbf{J}^T(\mathbf{x}) \mathbf{f}(\mathbf{x}) + \frac{1}{2} \mathbf{h}^T \mathbf{J}^T(\mathbf{x}) \mathbf{J}(\mathbf{x}) \mathbf{h} \\ \end{align} \\ \text{where} \ \mathbf{h}\ \text{sufficiently small} F(x+h)=21fT(x+h)f(x+h)21(f(x)+J(x)h)T(f(x)+J(x)h)=F(x)+hTJT(x)f(x)+21hTJT(x)J(x)hwhere h sufficiently small

L ( h ) = F ( x ) + h T J T ( x ) f ( x ) + 1 2 h T J T ( x ) J ( x ) h L(\mathbf{h}) = F(\mathbf{x}) + \mathbf{h}^T \mathbf{J}^T(\mathbf{x}) \mathbf{f}(\mathbf{x}) + \frac{1}{2} \mathbf{h}^T \mathbf{J}^T(\mathbf{x}) \mathbf{J}(\mathbf{x}) \mathbf{h} L(h)=F(x)+hTJT(x)f(x)+21hTJT(x)J(x)h
每一次的迭代的基本思路是,在 x \mathbf{x} x处求
h l m = argmin h { L ( h ) + P ( μ , h ) } \mathbf{h}_{lm} = \underset{\mathbf{h}}{\text{argmin}} \{L(\mathbf{h}) + P(\mu,\mathbf{h}) \} hlm=hargmin{L(h)+P(μ,h)}
其中 P ( μ , h ) P(\mathbf{\mu,h}) P(μ,h)是惩罚项,是为了使得上式求得的 h l m \mathbf{h}_{lm} hlm偏小些,并且可以通过调大 μ \mu μ来获得更小的 h l m \mathbf{h}_{lm} hlm,或者调小 μ \mu μ来获得更大的 h l m \mathbf{h}_{lm} hlm。常见的 P ( μ , h ) P(\mu,\mathbf{h}) P(μ,h)的形式有:
P ( μ , h ) = 1 2 μ h T h P(\mu, \mathbf{h}) = \frac{1}{2} \mu \mathbf{h}^T\mathbf{h} P(μ,h)=21μhTh

A.1 求解单应性矩阵的LM算法

n = 8 n = 8 n=8 x \mathbf{x} x对应的是单应性矩阵里待解的8个未知量:
z ⋅ [ u v 1 ] = [ x 1 x 2 x 3 x 4 x 5 x 6 x 7 x 8 1 ] ⋅ [ a b 1 ] ⇒ { u = x 1 a + x 2 b + x 3 x 7 a + x 8 b + 1 v = x 4 a + x 5 b + x 6 x 7 a + x 8 b + 1 z \cdot \begin{bmatrix} u \\ v \\ 1 \\ \end{bmatrix} = \begin{bmatrix} x_1 &x_2 &x_3 \\ x_4 &x_5 &x_6 \\ x_7 &x_8 &1 \\ \end{bmatrix} \cdot \begin{bmatrix} a \\ b \\ 1 \\ \end{bmatrix} \\ \Rightarrow \begin{cases} \displaystyle u = \frac{x_1 a + x_2 b + x_3}{x_7 a + x_8 b + 1} \\ \displaystyle v = \frac{x_4 a + x_5 b + x_6}{x_7 a + x_8 b + 1} \\ \end{cases} z uv1 = x1x4x7x2x5x8x3x61 ab1 u=x7a+x8b+1x1a+x2b+x3v=x7a+x8b+1x4a+x5b+x6
假设有 s s s个点对,那么 m = 2 ⋅ s m=2 \cdot s m=2s,因为一个点对产生两个误差函数,即对于第 i i i个点对,产生
f 2 i − 1 ( x ) = x 1 a i + x 2 b i + x 3 x 7 a i + x 8 b i + 1 − u i f 2 i ( x ) = x 4 a i + x 5 b i + x 6 x 7 a i + x 8 b i + 1 − v i f_{2i-1}(\mathbf{x}) = \frac{x_1 a_i + x_2 b_i + x_3}{x_7 a_i + x_8 b_i + 1} - u_i \\ f_{2i}(\mathbf{x}) = \frac{x_4 a_i + x_5 b_i + x_6}{x_7 a_i + x_8 b_i + 1} - v_i \\ f2i1(x)=x7ai+x8bi+1x1ai+x2bi+x3uif2i(x)=x7ai+x8bi+1x4ai+x5bi+x6vi
对应的一阶导数是
f 2 i − 1 ′ ( x ) = [ a i x 7 a i + x 8 b i + 1 , b i x 7 a i + x 8 b i + 1 , 1 x 7 a i + x 8 b i + 1 , 0 , 0 , 0 , − a i ( x 1 a i + x 2 b i + x 3 ) ( x 7 a i + x 8 b i + 1 ) 2 , − b i ( x 1 a i + x 2 b i + x 3 ) ( x 7 a i + x 8 b i + 1 ) 2 ] T f 2 i ′ ( x ) = [ 0 , 0 , 0 , a i x 7 a i + x 8 b i + 1 , b i x 7 a i + x 8 b i + 1 , 1 x 7 a i + x 8 b i + 1 , − a i ( x 4 a i + x 5 b i + x 6 ) ( x 7 a i + x 8 b i + 1 ) 2 , − b i ( x 4 a i + x 5 b i + x 6 ) ( x 7 a i + x 8 b i + 1 ) 2 ] T f^{'}_{2i-1}(\mathbf{x}) = \left[ \frac{a_i}{x_7 a_i + x_8 b_i + 1},\frac{b_i}{x_7 a_i + x_8 b_i + 1},\frac{1}{x_7 a_i + x_8 b_i + 1},0,0,0,-\frac{a_i(x_1 a_i + x_2 b_i + x_3)}{(x_7 a_i + x_8 b_i + 1)^2},-\frac{b_i(x_1 a_i + x_2 b_i + x_3)}{(x_7 a_i + x_8 b_i + 1)^2} \right]^T \\ f^{'}_{2i}(\mathbf{x}) = \left[ 0,0,0,\frac{a_i}{x_7 a_i + x_8 b_i + 1},\frac{b_i}{x_7 a_i + x_8 b_i + 1},\frac{1}{x_7 a_i + x_8 b_i + 1},-\frac{a_i(x_4 a_i + x_5 b_i + x_6)}{(x_7 a_i + x_8 b_i + 1)^2},-\frac{b_i(x_4 a_i + x_5 b_i + x_6)}{(x_7 a_i + x_8 b_i + 1)^2} \right]^T \\ f2i1(x)=[x7ai+x8bi+1ai,x7ai+x8bi+1bi,x7ai+x8bi+11,0,0,0,(x7ai+x8bi+1)2ai(x1ai+x2bi+x3),(x7ai+x8bi+1)2bi(x1ai+x2bi+x3)]Tf2i(x)=[0,0,0,x7ai+x8bi+1ai,x7ai+x8bi+1bi,x7ai+x8bi+11,(x7ai+x8bi+1)2ai(x4ai+x5bi+x6),(x7ai+x8bi+1)2bi(x4ai+x5bi+x6)]T
这里 P ( μ , h ) P(\mu, \mathbf{h}) P(μ,h)的形式如下所示
P ( μ , h ) = 1 2 μ h T W h P(\mu, \mathbf{h}) = \frac{1}{2} \mu \mathbf{h}^T \mathbf{W} \mathbf{h} P(μ,h)=21μhTWh

W \mathbf{W} W是一个已知的对角矩阵,而且对角线元素都是大于0的。


ψ ( h ) = L ( h ) + 1 2 μ h T W h \psi(\mathbf{h}) = L(\mathbf{h}) + \frac{1}{2} \mu \mathbf{h}^T \mathbf{W} \mathbf{h} ψ(h)=L(h)+21μhTWh
求解上式的最小值,即令
ψ ′ ( h ) = J T ( x ) f ( x ) + ( J T ( x ) J ( x ) + μ W ) h = 0 \psi^{'}(\mathbf{h}) = \mathbf{J}^T(\mathbf{x}) \mathbf{f}(\mathbf{x}) + \left( \mathbf{J}^T(\mathbf{x}) \mathbf{J}(\mathbf{x}) + \mu \mathbf{W} \right) \mathbf{h} = 0 ψ(h)=JT(x)f(x)+(JT(x)J(x)+μW)h=0
于是可得
h l m = − ( A + μ W ) − 1 g A = J T ( x ) J ( x ) g = J T ( x ) f ( x ) \mathbf{h}_{lm} = - \left( \mathbf{A} + \mu \mathbf{W} \right)^{-1} \mathbf{g} \\ \mathbf{A} = \mathbf{J}^T(\mathbf{x}) \mathbf{J}(\mathbf{x}) \\ \mathbf{g} = \mathbf{J}^T(\mathbf{x}) \mathbf{f}(\mathbf{x}) hlm=(A+μW)1gA=JT(x)J(x)g=JT(x)f(x)

整个算法如下所示:

已知: x 0 \mathbf{x}_0 x0, ξ 1 = 1 e − 15 \xi_1=1e^{-15} ξ1=1e15 ξ 2 = 1 e − 15 \xi_2=1e^{-15} ξ2=1e15 k m a x k_{\mathrm{max}} kmax

解法:

begin

k = 0 k = 0 k=0

x = x 0 \mathbf{x} = \mathbf{x}_0 x=x0

A = J T ( x ) J ( x ) ; g = J T ( x ) f ( x ) \mathbf{A} = \mathbf{J}^T(\mathbf{x}) \mathbf{J}(\mathbf{x});\quad \mathbf{g} = \mathbf{J}^T(\mathbf{x}) \mathbf{f}(\mathbf{x}) A=JT(x)J(x);g=JT(x)f(x)

W = d i a g ( [ a 11 , a 22 , ⋯   , a n n ] ) ; μ = 1 ; τ = 0.75 \mathbf{W} = diag([a_{11},a_{22}, \cdots,a_{nn}]); \quad \mu = 1; \quad \tau = 0.75 W=diag([a11,a22,,ann]);μ=1;τ=0.75

found = $ \left(||\mathbf{f}(\mathbf{x})|| < \xi_2 \right)$

while (not found) and ( k < k m a x k < k_{\mathrm{max}} k<kmax)

k = k + 1 k = k + 1 k=k+1

Solve ( A + μ W ) h l m = − g \left( \mathbf{A} + \mu \mathbf{W} \right) \mathbf{h}_{lm} = -\mathbf{g} (A+μW)hlm=g

x n e w = x + h l m \mathbf{x}_{\mathrm{new}} = \mathbf{x} + \mathbf{h}_{lm} xnew=x+hlm

ρ = F ( x ) − F ( x n e w ) L ( 0 ) − L ( h l m ) \rho = \displaystyle \frac{F(\mathbf{x}) - F(\mathbf{x}_{\mathrm{new}})}{L(\mathbf{0}) - L(\mathbf{h}_{lm})} ρ=L(0)L(hlm)F(x)F(xnew)

if ρ > 0.75 \rho > 0.75 ρ>0.75

μ = μ / 2 \mu = \mu /2 μ=μ/2

if μ < τ \mu < \tau μ<τ

μ = 0 \mu = 0 μ=0

elseif ρ < 0.25 \rho < 0.25 ρ<0.25

ν = m i n { m a x { 2 × ( 1 − F ( x n e w ) − F ( x ) h l m T g ) , 2 } , 10 } \nu = \mathrm{min} \left\{ \mathrm{max} \left\{ 2 \times \left( 1 - \displaystyle \frac{F(\mathbf{x}_{\mathrm{new}}) - F(\mathbf{x})}{\mathbf{h}^T_{lm} \mathbf{g}} \right),2 \right\},10 \right\} ν=min{max{2×(1hlmTgF(xnew)F(x)),2},10}

if μ = = 0 \mu == 0 μ==0

B = A − 1 \mathbf{B} = \mathbf{A}^{-1} B=A1

t e m p = m a x { [ b 11 , ⋯   , b n n ] } temp = \mathrm{max}\{ [b_{11},\cdots,b_{nn}] \} temp=max{[b11,,bnn]}

μ = 1 / t e m p ; τ = 1 / t e m p \mu = 1/temp; \quad \tau = 1/temp μ=1/temp;τ=1/temp

ν = ν / 2 \nu = \nu / 2 ν=ν/2

μ = μ ⋅ ν \mu = \mu \cdot \nu μ=μν

if F ( x n e w ) < F ( x ) F(\mathbf{x}_{\mathrm{new}}) < F(\mathbf{x}) F(xnew)<F(x)

x = x n e w \mathbf{x} = \mathbf{x}_{\mathrm{new}} x=xnew

A = J T ( x ) J ( x ) ; g = J T ( x ) f ( x ) \mathbf{A} = \mathbf{J}^T(\mathbf{x}) \mathbf{J}(\mathbf{x});\quad \mathbf{g} = \mathbf{J}^T(\mathbf{x}) \mathbf{f}(\mathbf{x}) A=JT(x)J(x);g=JT(x)f(x)

found = $ \left(||\mathbf{h}_{lm}|| < \xi_1 \ \mathrm{or} \ ||\mathbf{f}(\mathbf{x})|| < \xi_2 \right)$

end

你可能感兴趣的:(workspace,算法,矩阵,机器学习)