Least-Squares Problems

The least-squares problem

$$
\min f(x) = \frac{1}{2}\sum_{i=1}^{m} r_i^2(x) = \frac{1}{2}\, r(x)^T r(x), \quad x \in \mathbb{R}^n,\ m \geqslant n \tag{1}
$$
Here $r(x) = (r_1(x), r_2(x), \cdots, r_m(x))^T$ is called the residual function, and its value at a point $x$ is called the residual. If every $r_i(x)$ is a linear function, problem (1) is a linear least-squares problem; if at least one $r_i(x)$ is nonlinear, problem (1) is a nonlinear least-squares problem.

Derivatives of f(x)

Let $J(x)$ be the Jacobian matrix of $r(x)$:
$$
J(x) = \frac{\partial r}{\partial x} = \left[ \nabla r_1(x), \ldots, \nabla r_m(x) \right]^T \in \mathbb{R}^{m\times n} \tag{2}
$$
The gradient of $f(x)$ is
$$
g(x) = \nabla f(x) = \sum_{i=1}^{m} r_i(x)\, \nabla r_i(x) = J^T(x)\, r(x) \tag{3}
$$
The Hessian matrix of $f(x)$ is
$$
\begin{aligned}
G(x) &= \nabla^2 f(x) = \sum_{i=1}^{m} \nabla r_i(x)\, \nabla r_i(x)^T + \sum_{i=1}^{m} r_i(x)\, \nabla^2 r_i(x) \\
&= J^T(x) J(x) + S(x)
\end{aligned} \tag{4}
$$
where
$$
S(x) = \sum_{i=1}^{m} r_i(x)\, \nabla^2 r_i(x) \tag{5}
$$
For convenience, we use the following notation:
$$
\begin{gathered}
J^{\ast} = J(x^{\ast}), \quad J_k = J(x_k) \\
S^{\ast} = S(x^{\ast}), \quad S_k = S(x_k)
\end{gathered}
$$
The quantities $r_k$, $g_k$ and $G_k$ used below are defined in the same way, as the values of $r$, $g$ and $G$ at $x_k$.
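As a sanity check on formulas (3) to (5), the following sketch evaluates $g$, $S$ and $G$ for a small exponential-fit problem and compares them with central finite differences. The model $r_i(x) = x_1 e^{x_2 t_i} - y_i$, the data `t` and `y`, and all function names are assumptions made for illustration; none of them comes from the text above.

```python
import numpy as np

# Hypothetical data for the illustrative model r_i(x) = x1 * exp(x2 * t_i) - y_i.
t = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([2.0, 1.3, 0.7, 0.5])

def residual(x):
    return x[0] * np.exp(x[1] * t) - y

def jacobian(x):
    # Row i is grad r_i(x)^T, so J has shape (m, n) as in (2).
    e = np.exp(x[1] * t)
    return np.column_stack([e, x[0] * t * e])

def objective(x):
    r = residual(x)
    return 0.5 * r @ r

def gradient(x):
    # Formula (3): g = J^T r.
    return jacobian(x).T @ residual(x)

def second_order_term(x):
    # Formula (5): S = sum_i r_i * Hessian(r_i).
    e = np.exp(x[1] * t)
    S = np.zeros((2, 2))
    for r_i, t_i, e_i in zip(residual(x), t, e):
        hess_r_i = np.array([[0.0, t_i * e_i],
                             [t_i * e_i, x[0] * t_i**2 * e_i]])
        S += r_i * hess_r_i
    return S

def hessian(x):
    # Formula (4): G = J^T J + S.
    J = jacobian(x)
    return J.T @ J + second_order_term(x)

def central_difference(fun, x, h=1e-6):
    # Column j approximates the derivative of fun with respect to x_j.
    cols = []
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = h
        cols.append((fun(x + step) - fun(x - step)) / (2.0 * h))
    return np.stack(cols, axis=-1)

x = np.array([2.0, -0.5])
grad_error = np.max(np.abs(gradient(x) - central_difference(objective, x)))
hess_error = np.max(np.abs(hessian(x) - central_difference(gradient, x)))
print(grad_error, hess_error)
```

Both errors should be small relative to the entries of $g$ and $G$. Note that `second_order_term` needs one $n\times n$ Hessian per residual, which is the cost that the Gauss-Newton method avoids.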

Classification of least-squares problems

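At the point $x^{\ast}$, the size of $\Vert S^{\ast}\Vert$ depends on the residual and on how nonlinear the problem is. For zero-residual or linear least-squares problems, $\Vert S^{\ast}\Vert = 0$; as the residual grows or the functions $r_i(x)\ (i=1,\cdots,m)$ become more strongly nonlinear, $\Vert S^{\ast}\Vert$ increases. Based on this property, algorithms are divided into small-residual and large-residual algorithms. Small-residual algorithms handle problems in which $\Vert S^{\ast}\Vert$ is zero or not too large; large-residual algorithms handle problems in which $\Vert S^{\ast}\Vert$ is large.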

Newton's method for least-squares problems

The second-order Taylor expansion of $f$ about $x_k$ is
$$
f(x) = f(x_k) + \nabla f(x_k)^T (x - x_k) + \frac{1}{2}(x - x_k)^T \nabla^2 f(x_k)(x - x_k) + o\left( \Vert x - x_k \Vert^2 \right)
$$
Dropping the remainder gives a local approximation of $f$, which is a quadratic function:
$$
q(x) = f(x_k) + \nabla f(x_k)^T (x - x_k) + \frac{1}{2}(x - x_k)^T \nabla^2 f(x_k)(x - x_k)
$$
A stationary point of this quadratic is found by setting its derivative to zero (it is the minimizer when $\nabla^2 f(x_k)$ is positive definite):
$$
\nabla q(x) = \nabla f(x_k) + \nabla^2 f(x_k)(x - x_k) = 0
$$
Let $d = x - x_k$ be the increment. Substituting the expressions (3) and (4) for $\nabla f(x_k)$ and $\nabla^2 f(x_k)$ gives the Newton equation
$$
\left( J_k^T J_k + S_k \right) d = -J_k^T r_k \tag{6}
$$
For least-squares problems, the drawback of Newton's method is that every iteration requires $S_k$, that is, the $m$ symmetric $n\times n$ matrices $\nabla^2 r_i(x_k)$. Computing $S_k$ is clearly a heavy burden for an algorithm. The remedy is either to drop $S_k$ from the Newton equation or to approximate $S_k$ using first-derivative information. Dropping $S_k$ is justified only when the $r_i(x)$ are close to zero or close to linear. This leads to the small-residual algorithms discussed next.

The Gauss-Newton method

Dropping $S_k$ from the Newton equation (6) gives the Gauss-Newton (GN) method. The method can also be understood as follows: linearizing the residual functions $r_i(x_k+d)$ at the point $x_k$, we obtain a linear least-squares problem in $d$
$$
\min_{d\in \mathbb{R}^n} q_k(d) = \frac{1}{2}\Vert r_k + J_k d\Vert_2^2 \tag{7}
$$
where
$$
\begin{aligned}
q_k(d) &= \frac{1}{2}(J_k d + r_k)^T (J_k d + r_k) \\
&= \frac{1}{2} d^T J_k^T J_k d + d^T \left( J_k^T r_k \right) + \frac{1}{2} r_k^T r_k
\end{aligned} \tag{8}
$$
Here $q_k(d)$ is a quadratic approximation of $f(x_k+d)$. It differs from the second-order Taylor approximation of $f(x_k+d)$ only in that $S_k$ is missing from the quadratic term.
A minimizer $d_k$ of problem (7) satisfies
$$
J_k^T J_k d_k = -J_k^T r_k \tag{9}
$$
Equation (9) is called the Gauss-Newton equation, and the direction $d_k$ obtained from it is called the Gauss-Newton direction.
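Numerically, the Gauss-Newton direction can be obtained either from the normal equations (9) or by solving the linear least-squares problem (7) directly. The sketch below does both with NumPy for a small made-up $J_k$ and $r_k$; the values and the function names are illustrative assumptions. Solving (7) through `np.linalg.lstsq` avoids forming $J_k^TJ_k$, whose condition number is the square of that of $J_k$.

```python
import numpy as np

def gn_step_normal_equations(J, r):
    # Solve (9): J^T J d = -J^T r. Requires J to have full column rank.
    return np.linalg.solve(J.T @ J, -J.T @ r)

def gn_step_lstsq(J, r):
    # Solve (7): min_d 0.5 * ||r + J d||^2 without forming J^T J.
    d, *_ = np.linalg.lstsq(J, -r, rcond=None)
    return d

# Made-up Jacobian (m = 3, n = 2) and residual vector for illustration.
J = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
r = np.array([0.5, -0.2, 0.1])

d_ne = gn_step_normal_equations(J, r)
d_ls = gn_step_lstsq(J, r)
optimality = J.T @ (J @ d_ls + r)  # gradient of q_k at d_ls, should be close to 0
print(d_ne, d_ls, optimality)
```

For a well-conditioned $J_k$ like this one the two directions agree to rounding error; the difference only matters when $J_k$ is close to rank deficient.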

The Gauss-Newton algorithm for solving the least-squares problem is as follows.

Algorithm 1 (Gauss-Newton method for least-squares problems)

  1. Choose $x_0$ and $\varepsilon > 0$; set $k := 0$.
  2. If the termination condition is satisfied, stop.
  3. Solve $J_k^T J_k d = -J_k^T r_k$ to obtain $d_k$.
  4. Set $x_{k+1} := x_k + \alpha_k d_k$, where $\alpha_k$ is the step length returned by a one-dimensional line search; set $k := k+1$ and go to step 2.

The basic Gauss-Newton method is the Gauss-Newton method with $\alpha_k = 1$. The Gauss-Newton method with a line search is called the damped Gauss-Newton method.
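A minimal Python sketch of Algorithm 1 follows. Two details are assumptions, because the algorithm leaves them open: the termination test is taken to be $\Vert J_k^T r_k\Vert \leqslant \varepsilon$, and the line search is a simple Armijo backtracking rule. The exponential test problem and all names (`gauss_newton`, `residual`, `jacobian`, `damped`) are also illustrative.

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, eps=1e-8, max_iter=100, damped=True):
    """Sketch of Algorithm 1. damped=False gives the basic method (alpha_k = 1)."""
    x = np.asarray(x0, dtype=float)
    for k in range(max_iter):
        r = residual(x)
        J = jacobian(x)
        g = J.T @ r
        # Step 2: assumed termination test on the gradient norm.
        if np.linalg.norm(g) <= eps:
            return x, k
        # Step 3: Gauss-Newton direction from the linear least-squares problem.
        d, *_ = np.linalg.lstsq(J, -r, rcond=None)
        # Step 4: step length, here Armijo backtracking (an assumption).
        alpha = 1.0
        if damped:
            f0 = 0.5 * r @ r
            slope = g @ d  # negative when d is a descent direction
            while alpha > 1e-12:
                r_trial = residual(x + alpha * d)
                if 0.5 * r_trial @ r_trial <= f0 + 1e-4 * alpha * slope:
                    break
                alpha *= 0.5
        x = x + alpha * d
    return x, max_iter

# Illustrative zero-residual problem: fit y = x1 * exp(x2 * t) to exact data.
t = np.linspace(0.0, 2.0, 5)
y = 2.0 * np.exp(-t)

def residual(x):
    return x[0] * np.exp(x[1] * t) - y

def jacobian(x):
    e = np.exp(x[1] * t)
    return np.column_stack([e, x[0] * t * e])

x_basic, iters_basic = gauss_newton(residual, jacobian, [1.0, 0.0], damped=False)
x_damped, iters_damped = gauss_newton(residual, jacobian, [1.0, 0.0], damped=True)
print(x_basic, iters_basic, x_damped, iters_damped)
```

On this problem both variants should recover parameters close to $(2, -1)$. The damped variant is the safer choice when the starting point is far from the solution, since the full step of the basic method is not guaranteed to reduce $f$.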

An advantage of the Gauss-Newton method is that it does not need the second derivatives of $r(x)$. In addition, from (3) and (9) we have
$$
d_k^T g_k = d_k^T J_k^T r_k = -d_k^T J_k^T J_k d_k = -\Vert J_k d_k \Vert^2
$$

This shows that $d_k$ is a descent direction whenever $J_k$ has full column rank and $g_k$ is nonzero: in that case $d_k \neq 0$ and $J_k d_k \neq 0$, so $d_k^T g_k < 0$.
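The identity $d_k^T g_k = -\Vert J_k d_k\Vert^2$ is easy to confirm numerically. The following sketch uses a random Jacobian and residual vector, both purely illustrative stand-ins for $J_k$ and $r_k$; a random $6\times 3$ matrix has full column rank with probability one.

```python
import numpy as np

rng = np.random.default_rng(0)
J = rng.standard_normal((6, 3))  # stand-in for J_k
r = rng.standard_normal(6)       # stand-in for r_k

g = J.T @ r                                 # gradient, formula (3)
d, *_ = np.linalg.lstsq(J, -r, rcond=None)  # Gauss-Newton direction, equation (9)

lhs = d @ g
rhs = -np.linalg.norm(J @ d) ** 2
print(lhs, rhs)
```

The two printed values should agree to rounding error, and both should be negative.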

Theorem 2 (local convergence of the basic Gauss-Newton method)
Let $r_i(x)\in C^2\ (i=1,\cdots,m)$, let $x^{\ast}$ be an optimal solution of the least-squares problem (1), and let $J^{\ast T}J^{\ast}$ be positive definite. Suppose the sequence $\{x_k\}$ generated by the basic Gauss-Newton method converges to $x^{\ast}$. If $G(x)$ and $J(x)^TJ(x)$ are Lipschitz continuous in a neighborhood of $x^{\ast}$, then
$$
\Vert h_{k+1} \Vert \leqslant \left\Vert \left( J^{\ast T}J^{\ast} \right)^{-1} \right\Vert \Vert S^{\ast} \Vert \Vert h_k \Vert + O\left( \Vert h_k \Vert^2 \right)
$$
where $h_k = x_k - x^{\ast}$.
Proof
Since $f\in C^2$ and $G(x)$ is Lipschitz continuous in a neighborhood of $x^{\ast}$, when $x_k$ is sufficiently close to $x^{\ast}$ we have, as in the proof of the convergence theorem for Newton's method,
$$
g(x_k + d) = g_k + G_k d + O\left( \Vert d \Vert^2 \right)
$$
Setting $d = -h_k$ gives
$$
0 = g^{\ast} = g_k - G_k h_k + O\left( \Vert h_k \Vert^2 \right)
$$
Substituting (3) and (4) into this equation gives
$$
J_k^T r_k - \left( J_k^T J_k + S_k \right) h_k + O\left( \Vert h_k \Vert^2 \right) = 0 \tag{10}
$$
Since $J^{\ast T}J^{\ast}$ is positive definite, $J_k^TJ_k$ is also positive definite when $x_k$ is sufficiently close to $x^{\ast}$. Multiplying (10) from the left by $(J_k^TJ_k)^{-1}$ and using the Gauss-Newton equation (9), we obtain
$$
-d_k - h_k - \left( J_k^T J_k \right)^{-1} S_k h_k + O\left( \Vert h_k \Vert^2 \right) = 0
$$
Because
$$
d_k + h_k = x_{k+1} - x_k + x_k - x^{\ast} = h_{k+1}
$$
it follows that
$$
\begin{aligned}
h_{k+1} &= -\left( J_k^T J_k \right)^{-1} S_k h_k + O\left( \Vert h_k \Vert^2 \right) \\
\Vert h_{k+1} \Vert &\leqslant \left\Vert \left( J_k^T J_k \right)^{-1} S_k \right\Vert \Vert h_k \Vert + O\left( \Vert h_k \Vert^2 \right) \\
&\leqslant \left\Vert \left( J_k^T J_k \right)^{-1} S_k - \left( J^{\ast T}J^{\ast} \right)^{-1} S^{\ast} \right\Vert \Vert h_k \Vert + \left\Vert \left( J^{\ast T}J^{\ast} \right)^{-1} \right\Vert \Vert S^{\ast} \Vert \Vert h_k \Vert + O\left( \Vert h_k \Vert^2 \right)
\end{aligned} \tag{11}
$$
In the following proof that $S(x)$ and $(J(x)^TJ(x))^{-1}$ are Lipschitz continuous in a neighborhood of $x^{\ast}$, we write $A_x = A(x)$ for any matrix function $A(x)$. Since $G_x$ and $J_x^TJ_x$ are Lipschitz continuous in a neighborhood of $x^{\ast}$, there exist $\beta, \gamma > 0$ such that for any two points $x, y$ in that neighborhood
$$
\begin{aligned}
\Vert G(x) - G(y) \Vert &\leqslant \beta \Vert x - y \Vert \\
\left\Vert J(x)^T J(x) - J(y)^T J(y) \right\Vert &\leqslant \gamma \Vert x - y \Vert
\end{aligned}
$$
and therefore
$$
\begin{aligned}
\Vert S(x) - S(y) \Vert &= \left\Vert G(x) - G(y) - J(x)^T J(x) + J(y)^T J(y) \right\Vert \\
&\leqslant \Vert G(x) - G(y) \Vert + \left\Vert J(x)^T J(x) - J(y)^T J(y) \right\Vert \\
&\leqslant (\beta + \gamma) \Vert x - y \Vert
\end{aligned}
$$
For any point $x$ in a neighborhood of $x^{\ast}$, the positive definiteness of $J^{\ast T}J^{\ast}$ and the continuity of $J(x)$ imply that there exists $\xi > 0$ such that $\Vert (J_x^TJ_x)^{-1} \Vert \leqslant \xi$, and therefore
$$
\begin{aligned}
\left\Vert \left( J_x^T J_x \right)^{-1} - \left( J_y^T J_y \right)^{-1} \right\Vert &= \left\Vert \left( J_x^T J_x \right)^{-1} \left( J_y^T J_y - J_x^T J_x \right) \left( J_y^T J_y \right)^{-1} \right\Vert \\
&\leqslant \left\Vert \left( J_x^T J_x \right)^{-1} \right\Vert \left\Vert \left( J_y^T J_y \right)^{-1} \right\Vert \left\Vert J_y^T J_y - J_x^T J_x \right\Vert \\
&\leqslant \gamma \xi^2 \Vert x - y \Vert
\end{aligned}
$$
Hence $S_x$ and $(J_x^TJ_x)^{-1}$ are also Lipschitz continuous in a neighborhood of $x^{\ast}$.
When $x_k$ is sufficiently close to $x^{\ast}$, we have
$$
\begin{aligned}
&\left\Vert \left( J_k^T J_k \right)^{-1} S_k - \left( J^{\ast T}J^{\ast} \right)^{-1} S^{\ast} \right\Vert \\
&\leqslant \left\Vert \left( J_k^T J_k \right)^{-1} S_k - \left( J_k^T J_k \right)^{-1} S^{\ast} \right\Vert + \left\Vert \left( J_k^T J_k \right)^{-1} S^{\ast} - \left( J^{\ast T}J^{\ast} \right)^{-1} S^{\ast} \right\Vert \\
&\leqslant (\beta + \gamma) \left\Vert \left( J_k^T J_k \right)^{-1} \right\Vert \Vert h_k \Vert + \gamma \xi^2 \Vert S^{\ast} \Vert \Vert h_k \Vert \\
&\leqslant \left( (\beta + \gamma)\xi + \gamma \xi^2 \Vert S^{\ast} \Vert \right) \Vert h_k \Vert
\end{aligned}
$$
so that
$$
\left\Vert \left( J_k^T J_k \right)^{-1} S_k - \left( J^{\ast T}J^{\ast} \right)^{-1} S^{\ast} \right\Vert \Vert h_k \Vert \leqslant \left( (\beta + \gamma)\xi + \gamma \xi^2 \Vert S^{\ast} \Vert \right) \Vert h_k \Vert^2
$$
Substituting this bound into (11) gives
$$
\Vert h_{k+1} \Vert \leqslant \left\Vert \left( J^{\ast T}J^{\ast} \right)^{-1} \right\Vert \Vert S^{\ast} \Vert \Vert h_k \Vert + O\left( \Vert h_k \Vert^2 \right)
$$
which proves the theorem.

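The theorem shows that, if $x_k\to x^{\ast}$, the basic Gauss-Newton method converges at one of the following two rates:

  • Second-order (quadratic) convergence. If $\Vert S(x^{\ast})\Vert = 0$, that is, for zero-residual problems or linear least-squares problems, the method has the same convergence rate as Newton's method near $x^{\ast}$.
  • Linear convergence. If $\Vert S(x^{\ast})\Vert \neq 0$, the bound only guarantees a linear rate, with a factor of at most $\Vert (J^{\ast T}J^{\ast})^{-1}\Vert \Vert S^{\ast}\Vert$, and convergence slows down as $\Vert S(x^{\ast})\Vert$ grows.

The convergence rate of the basic Gauss-Newton method therefore depends on the size of the residual at $x^{\ast}$ and on how close to linear the residual functions are: the smaller the residual or the closer the residual functions are to linear, the faster the convergence; otherwise it is slower, and for problems with very large residuals or strongly nonlinear residual functions the method may fail to converge.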
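Both rates can be observed on a one-parameter problem constructed for illustration (it is an assumption, not taken from the text above): $r(x) = (x + c,\ \lambda x^2 + x - c)^T$ with $n = 1$, $m = 2$. For $\lambda c < 1$ the point $x^{\ast} = 0$ is a local minimizer with $J^{\ast T}J^{\ast} = 2$ and $S^{\ast} = -2\lambda c$, so the factor in Theorem 2 is $\Vert (J^{\ast T}J^{\ast})^{-1}\Vert \Vert S^{\ast}\Vert = |\lambda c|$. The residual size $c$ and the nonlinearity $\lambda$ both enter this factor.

```python
def gauss_newton_1d(lam, c, x0, iterations):
    """Basic Gauss-Newton (alpha_k = 1) for r(x) = (x + c, lam*x**2 + x - c)."""
    xs = [x0]
    x = x0
    for _ in range(iterations):
        r1, r2 = x + c, lam * x * x + x - c
        j1, j2 = 1.0, 2.0 * lam * x + 1.0  # the two Jacobian entries (n = 1)
        # Equation (9) in the scalar case: d = -(J^T r) / (J^T J).
        x = x - (j1 * r1 + j2 * r2) / (j1 * j1 + j2 * j2)
        xs.append(x)
    return xs

# Zero residual at x* = 0 (c = 0): S* = 0, quadratic convergence expected.
zero_residual = gauss_newton_1d(lam=0.5, c=0.0, x0=0.5, iterations=6)

# Nonzero residual (c = 1): S* = -1, predicted linear factor |lam * c| = 0.5.
large_residual = gauss_newton_1d(lam=0.5, c=1.0, x0=0.5, iterations=20)

ratio = abs(large_residual[-1] / large_residual[-2])
print(abs(zero_residual[-1]), ratio)
```

With $c = 0$ the error roughly squares at every step, while with $c = 1$ the ratio of successive errors settles near the predicted factor $0.5$. Choosing $\lambda c$ with $|\lambda c| > 1$ makes the factor exceed one, which is the regime in which the basic method can fail to converge.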
