非线性模型的最小二乘(LS)近似解

文章目录

  • 1 非线性误差方程
  • 2 非线性模型的LS解
  • 3 实例分析
    • 3.1 问题
    • 3.2 解答

  现实世界中存在大量的非线性模型,并且我们常常需要对模型中的未知参数进行估计。当模型的非线性强度较弱、精度要求合适时,我们可将非线性模型线性化获取其近似解。本文讨论以最小二乘法估计线性化后的模型参数。

1 非线性误差方程

  非线性的观测方程可表示为
L n , 1 = f n , 1 ( X t , 1 ) + Δ n , 1 (1) \mathop{L}\limits_{n,1}=\mathop{f}\limits_{n,1}(\mathop{X}\limits_{t,1})+\mathop{\Delta}\limits_{n,1} \tag{1} n,1L=n,1f(t,1X)+n,1Δ(1)
式中, f ( X ) = ( f 1 ( X ) f 2 ( X ) ⋯ f n ( X ) ) T f(X)=\left( f_{1}(X)f_{2}(X)\cdots f_{n}(X) \right)^{T} f(X)=(f1(X)f2(X)fn(X))T是由n个X的非线性函数组成的 n × 1 n\times1 n×1向量。
  当未知参数(X)、真误差( Δ \Delta Δ)分别用其估值 X ^ \hat{X} X^ V V V代替时,则可得非线性误差方程
V n , 1 = f n , 1 ( X ^ ) − L n , 1 (2) \mathop{V}\limits_{n,1}=\mathop{f}\limits_{n,1}(\hat{X})-\mathop{L}\limits_{n,1} \tag{2} n,1V=n,1f(X^)n,1L(2)
式中, V V V也称为残差向量。

2 非线性模型的LS解

  将非线性模型式 ( 2 ) (2) (2) X 0 X^{0} X0处泰勒展开并取至一次项,得
V = ∂ f ( X ^ ) ∂ X ^ ∣ X ^ = X 0 ∙ δ X ^ − ( L − f ( X 0 ) ) (3) V={\frac{\partial f(\hat{X})}{\partial \hat{X}}\mid_{\hat{X}=X^{0}}}\bullet {\delta\hat{X}}-(L-f(X^{0})) \tag{3} V=X^f(X^)X^=X0δX^(Lf(X0))(3)
若令 l = L − f ( X 0 ) l=L-f(X^{0}) l=Lf(X0),则
V = B ∙ δ X ^ − l (4) V=B\bullet\delta\hat{X}-l \tag{4} V=BδX^l(4)
式中,
B = ( ∂ f 1 ( X ^ ) ∂ X 1 ^ ∣ X ^ = X 0 ∂ f 1 ( X ^ ) ∂ X 2 ^ ∣ X ^ = X 0 ⋯ ∂ f 1 ( X ^ ) ∂ X t ^ ∣ X ^ = X 0 ∂ f 2 ( X ^ ) ∂ X 1 ^ ∣ X ^ = X 0 ∂ f 2 ( X ^ ) ∂ X 2 ^ ∣ X ^ = X 0 ⋯ ∂ f 2 ( X ^ ) ∂ X t ^ ∣ X ^ = X 0 ⋮ ⋮ ⋮ ⋮ ∂ f n ( X ^ ) ∂ X 1 ^ ∣ X ^ = X 0 ∂ f n ( X ^ ) ∂ X 2 ^ ∣ X ^ = X 0 ⋯ ∂ f n ( X ^ ) ∂ X t ^ ∣ X ^ = X 0 ) (5) B=\begin{pmatrix} {\frac{\partial f_1(\hat{X})}{\partial \hat{X_1}}\mid_{\hat{X}=X^{0}}} & {\frac{\partial f_1(\hat{X})}{\partial \hat{X_2}}\mid_{\hat{X}=X^{0}}} & \cdots & {\frac{\partial f_1(\hat{X})}{\partial \hat{X_t}}\mid_{\hat{X}=X^{0}}} \\ {\frac{\partial f_2(\hat{X})}{\partial \hat{X_1}}\mid_{\hat{X}=X^{0}}} & {\frac{\partial f_2(\hat{X})}{\partial \hat{X_2}}\mid_{\hat{X}=X^{0}}} & \cdots & {\frac{\partial f_2(\hat{X})}{\partial \hat{X_t}}\mid_{\hat{X}=X^{0}}}\\ \vdots & \vdots & \vdots & \vdots \\ {\frac{\partial f_n(\hat{X})}{\partial \hat{X_1}}\mid_{\hat{X}=X^{0}}} & {\frac{\partial f_n(\hat{X})}{\partial \hat{X_2}}\mid_{\hat{X}=X^{0}}} & \cdots & {\frac{\partial f_n(\hat{X})}{\partial \hat{X_t}}\mid_{\hat{X}=X^{0}}}\\ \end{pmatrix} \tag{5} B=X1^f1(X^)X^=X0X1^f2(X^)X^=X0X1^fn(X^)X^=X0X2^f1(X^)X^=X0X2^f2(X^)X^=X0X2^fn(X^)X^=X0Xt^f1(X^)X^=X0Xt^f2(X^)X^=X0Xt^fn(X^)X^=X0(5)
称为系数矩阵(设计矩阵)。
  根据最小二乘原理,
δ X ^ = ( B T P B ) − 1 B T P l (6) \delta\hat{X}=(B^TPB)^{-1}B^{T}Pl \tag{6} δX^=(BTPB)1BTPl(6)
式中, P P P为权阵(衡量各个分量对总体的重要程度)且 P = Q − 1 P=Q^{-1} P=Q1。这样,待估参数 X X X的最终估计结果为
X ^ = X 0 + δ X ^ (7) \hat{X}=X^{0}+\delta \hat{X} \tag{7} X^=X0+δX^(7)

3 实例分析

3.1 问题

  已知非线性模型 f ( x ) = a e b x f(x)=ae^{bx} f(x)=aebx(其中 a , b a,b a,b为待估参数),通过观测我们获得了如下表所示的观测结果。现我们用LS估计未知参数 a , b a,b a,b

x 1 2 3 4 5
f(x) 4.20 3.25 2.52 1.95 1.51

3.2 解答

  (1)列立观测方程
  依题意可列如下观测方程(对应式 ( 1 ) (1) (1)
{ f 1 = a e b + Δ 1 f 2 = a e 2 b + Δ 2 f 3 = a e 3 b + Δ 3 f 4 = a e 4 b + Δ 4 f 5 = a e 5 b + Δ 5 (8) \begin{cases} f_1=ae^b+\Delta_1 \\ f_2=ae^{2b}+\Delta_2\\ f_3=ae^{3b}+\Delta_3\\ f_4=ae^{4b}+\Delta_4\\ f_5=ae^{5b}+\Delta_5\\ \end{cases} \tag{8} f1=aeb+Δ1f2=ae2b+Δ2f3=ae3b+Δ3f4=ae4b+Δ4f5=ae5b+Δ5(8)
此时,我们可随机取两组数据(因为未知参数仅有 a , b a,b a,b)带入 f ( x ) = a e b x f(x)=ae^{bx} f(x)=aebx中解得 ( a b ) \begin{pmatrix} a \\ b \\ \end{pmatrix} (ab)的初始值为 ( a 0 b 0 ) = ( 5.4 − 0.3 ) \begin{pmatrix} a^0 \\ b^0 \\ \end{pmatrix}=\begin{pmatrix} 5.4 \\ -0.3 \\ \end{pmatrix} (a0b0)=(5.40.3)

  (2)估值代替得误差方程(对应式 ( 2 ) (2) (2)
{ v 1 = a ^ e b ^ − f 1 v 2 = a ^ e 2 b ^ − f 2 v 3 = a ^ e 3 b ^ − f 3 v 4 = a ^ e 4 b ^ − f 4 v 5 = a ^ e 5 b ^ − f 5 (9) \begin{cases} v_1=\hat{a}e^{\hat{b}}-f_1 \\ v_2=\hat{a}e^{2\hat{b}}-f_2\\ v_3=\hat{a}e^{3\hat{b}}-f_3\\ v_4=\hat{a}e^{4\hat{b}}-f_4\\ v_5=\hat{a}e^{5\hat{b}}-f_5\\ \end{cases} \tag{9} v1=a^eb^f1v2=a^e2b^f2v3=a^e3b^f3v4=a^e4b^f4v5=a^e5b^f5(9)

  (3)将误差方程于 X 0 X^0 X0处泰勒展开(对应式 ( 4 ) (4) (4)
{ v 1 = e b 0 ⋅ δ a + a 0 e b 0 ⋅ δ b − ( f 1 − a 0 e b 0 ) v 2 = e 2 b 0 ⋅ δ a + 2 a 0 e 2 b 0 ⋅ δ b − ( f 2 − a 0 e 2 b 0 ) v 3 = e 3 b 0 ⋅ δ a + 3 a 0 e 3 b 0 ⋅ δ b − ( f 3 − a 0 e 3 b 0 ) v 4 = e 4 b 0 ⋅ δ a + 4 a 0 e 4 b 0 ⋅ δ b − ( f 4 − a 0 e 4 b 0 ) v 5 = e 5 b 0 ⋅ δ a + 5 a 0 e 5 b 0 ⋅ δ b − ( f 5 − a 0 e 5 b 0 ) (10) \begin{cases} v_1=e^{b^0}\cdot\delta a+a^0e^{b^0}\cdot\delta b-(f_1-a^0e^{b^0}) \\ v_2=e^{2b^0}\cdot\delta a+2a^0e^{2b^0}\cdot\delta b-(f_2-a^0e^{2b^0})\\ v_3=e^{3b^0}\cdot\delta a+3a^0e^{3b^0}\cdot\delta b-(f_3-a^0e^{3b^0})\\ v_4=e^{4b^0}\cdot\delta a+4a^0e^{4b^0}\cdot\delta b-(f_4-a^0e^{4b^0})\\ v_5=e^{5b^0}\cdot\delta a+5a^0e^{5b^0}\cdot\delta b-(f_5-a^0e^{5b^0})\\ \end{cases} \tag{10} v1=eb0δa+a0eb0δb(f1a0eb0)v2=e2b0δa+2a0e2b0δb(f2a0e2b0)v3=e3b0δa+3a0e3b0δb(f3a0e3b0)v4=e4b0δa+4a0e4b0δb(f4a0e4b0)v5=e5b0δa+5a0e5b0δb(f5a0e5b0)(10)
带入具体数据,则式 ( 10 ) (10) (10)可化为
V = ( 0.7408 4.0004 0.5488 5.9272 0.4066 6.5864 0.3012 6.5058 0.2231 6.0245 ) ( δ a δ b ) − ( 0.1996 0.2864 0.3245 0.3236 0.3051 ) V=\begin{pmatrix} 0.7408 & 4.0004 \\ 0.5488 & 5.9272 \\ 0.4066 & 6.5864\\ 0.3012 & 6.5058\\ 0.2231 & 6.0245\\ \end{pmatrix} \begin{pmatrix} \delta a\\ \delta b\\ \end{pmatrix}- \begin{pmatrix} 0.1996\\ 0.2864\\ 0.3245\\ 0.3236\\ 0.3051\\ \end{pmatrix} V=0.74080.54880.40660.30120.22314.00045.92726.58646.50586.0245(δaδb)0.19960.28640.32450.32360.3051
那么,依式 ( 6 ) (6) (6)可解得
( δ a δ b ) = ( − 0.005858021 0.049953787 ) \begin{pmatrix} \delta a\\ \delta b\\ \end{pmatrix}= \begin{pmatrix} -0.005858021\\ 0.049953787\\ \end{pmatrix} (δaδb)=(0.0058580210.049953787)
则未知参数最终估值为
( a ^ b ^ ) = ( a 0 b 0 ) + ( δ a δ b ) = ( 5.394141979 − 0.250246213 ) \begin{pmatrix} \hat{a}\\ \hat{b}\\ \end{pmatrix}= \begin{pmatrix} a^0\\ b^0\\ \end{pmatrix}+ \begin{pmatrix} \delta a\\ \delta b\\ \end{pmatrix}= \begin{pmatrix} 5.394141979\\ -0.250246213\\ \end{pmatrix} (a^b^)=(a0b0)+(δaδb)=(5.3941419790.250246213)


  因此,我们求得的非线性模型为 f ( x ) = 5.394141979 e − 0.250246213 x f(x)=5.394141979e^{-0.250246213x} f(x)=5.394141979e0.250246213x

你可能感兴趣的:(我与测绘,最小二乘法,参数估计,非线性模型,泰勒展开,近似解)