卡尔曼滤波(Kalman Filter)能用于各种状态的预测(温度、湿度、距离等可量化值),并基于测量情况对预测结果进行校正。
卡尔曼滤波主要基于两组数据:
因为各种噪声的存在,预测误差和测量误差都不可忽略,两者的大小共同决定了相信预测多一些还是相信测量多一些。
假设FK用于对位置的预测,那么我们要预测变量有二:位置和速度。
本文把公式的推导分为6个步骤:
最终得出卡尔曼滤波的5个公式。
1. 状态预测
在当前k时刻,p和v分别表示位置(position)和速度(velocity):
x k ^ = ( p k , v k ) \hat{x_k} = (p_k, v_k) xk^=(pk,vk)
每个预测的元素之间都会相互关联,相互间的影响也会造成预测的不确定性。我们用协方差矩阵(covarance matrix)表示:
s k = [ ∑ p p ∑ p v ∑ v p ∑ v v ] s_k = \begin{bmatrix} \sum_{pp} & \sum_{pv} \\ \sum_{vp} & \sum_{vv} \end{bmatrix} sk=[∑pp∑vp∑pv∑vv]
我们从k时刻预测k+1时刻,假设没有加速度,那么位置和速度的转换公式如下:
{ p k + 1 = p k + △ t ⋅ v k v k + 1 = 0 + 1 ⋅ v k \left\{ \begin{aligned} &p_{k+1} = p_k + \triangle t \cdot v_k\\ &v_{k+1} = 0+1 \cdot v_k \end{aligned} \right. {pk+1=pk+△t⋅vkvk+1=0+1⋅vk
用状态转移矩阵 F k + 1 F_{k+1} Fk+1来表示转换矩阵,则公式可简写为:
F k + 1 = [ 1 △ t 0 1 ] , x k ^ = [ p k v k ] F_{k+1} = \begin{bmatrix} 1 & \triangle t \\ 0 & 1 \end{bmatrix} ,\ \hat{x_k} = \begin{bmatrix} p_k \\ v_k \end{bmatrix} Fk+1=[10△t1], xk^=[pkvk]
x k + 1 ^ = F k + 1 ⋅ x k ^ \hat{x_{k+1}} = F_{k+1}\cdot \hat{x_k} xk+1^=Fk+1⋅xk^
2. 协方差预测
根据协方差矩阵的计算公式:
C o v ( F k + 1 x k ^ ) = F k + 1 C o v ( x k ^ ) F k + 1 T Cov(F_{k+1}\hat{x_k})=F_{k+1}Cov(\hat{x_k})F_{k+1}^T Cov(Fk+1xk^)=Fk+1Cov(xk^)Fk+1T
可得出整体的转换结果为:
{ x k + 1 ^ = F k + 1 ⋅ x k ^ s k + 1 = F k + 1 ⋅ s k ⋅ F k + 1 T \left\{ \begin{aligned} & \hat{x_{k+1}} = F_{k+1}\cdot \hat{x_k} \\ & s_{k+1} = F_{k+1}\cdot s_k\cdot F_{k+1}^T \end{aligned} \right. {xk+1^=Fk+1⋅xk^sk+1=Fk+1⋅sk⋅Fk+1T
3. 已知外部因素
原本对状态的预测为:
x k + 1 ^ = { p k + 1 = p k + △ t ⋅ v k v k + 1 = 0 + 1 ⋅ v k \hat{x_{k+1}} = \left\{ \begin{aligned} &p_{k+1} = p_k + \triangle t \cdot v_k\\ &v_{k+1} = 0+1 \cdot v_k \end{aligned} \right. xk+1^={pk+1=pk+△t⋅vkvk+1=0+1⋅vk
我们加上加速度(外力)的因素:
x k + 1 ^ = { p k + 1 = p k + △ t ⋅ v k + 1 2 a △ t 2 v k + 1 = 0 + 1 ⋅ v k + a △ t x k + 1 ^ = F k + 1 ⋅ x k ^ + [ 1 2 △ t 2 △ t ] ⋅ a = F k + 1 ⋅ x k ^ + B k ⋅ u k B k = [ 1 2 △ t 2 △ t ] , u k = a \hat{x_{k+1}} = \left\{ \begin{aligned} &p_{k+1} = p_k + \triangle t \cdot v_k + \frac{1}{2} a \triangle t^2\\ &v_{k+1} = 0+1 \cdot v_k + a \triangle t \end{aligned} \right. \\ \ \\ \begin{aligned} \hat{x_{k+1}} &= F_{k+1} \cdot \hat{x_k} + \begin{bmatrix} \frac{1}{2}\triangle t^2 \\ \triangle t \end{bmatrix} \cdot a \\ &=F_{k+1} \cdot \hat{x_k} +B_k \cdot u_k \end{aligned} \\ \ \\ B_k = \begin{bmatrix} \frac{1}{2}\triangle t^2 \\ \triangle t \end{bmatrix}, u_k = a xk+1^=⎩⎨⎧pk+1=pk+△t⋅vk+21a△t2vk+1=0+1⋅vk+a△t xk+1^=Fk+1⋅xk^+[21△t2△t]⋅a=Fk+1⋅xk^+Bk⋅uk Bk=[21△t2△t],uk=a
其中, B k B_k Bk称控制矩阵,表示外部因素如何控制状态转移; u k u_k uk称控制向量,表示外部因素影响的大小。
4. 未知外部因素
除了已知的,可用公式表示的外部因素(如外力),还有很多未知或难以测量的因素,这些未知因素会造成预测误差,我们统称为噪声。
噪声的大小根本上影响预测状态的协方差,因此以 Q k + 1 Q_{k+1} Qk+1加进协方差中:
s k + 1 = F k + 1 ⋅ s k ⋅ F k + 1 T + Q k + 1 s_{k+1} = F_{k+1}\cdot s_k\cdot F_{k+1}^T + Q_{k+1} sk+1=Fk+1⋅sk⋅Fk+1T+Qk+1
综上,单纯预测的状态预测方程为:
{ x k + 1 ^ = F k + 1 ⋅ x k ^ + B k ⋅ u k s k + 1 = F k + 1 ⋅ s k ⋅ F k + 1 T + Q k + 1 \left\{ \begin{aligned} & \hat{x_{k+1}} = F_{k+1}\cdot \hat{x_k} + B_k\cdot u_k \\ & s_{k+1} = F_{k+1}\cdot s_k\cdot F_{k+1}^T + Q_{k+1} \end{aligned} \right. {xk+1^=Fk+1⋅xk^+Bk⋅uksk+1=Fk+1⋅sk⋅Fk+1T+Qk+1
这也是KF的五个公式中的前两个。
5. 预测空间映射至测量空间
预测的维度和测量维度可能不一致,需要通过矩阵的相乘,把预测值的维度映射到测量值的维度,才能用测量值对预测进行校准。
H k + 1 H_{k+1} Hk+1取决于预测和测量的维度的差别,通常仅用于维度的映射,不做数值大小的改变。
x μ ^ = H k + 1 ⋅ x k + 1 ^ s μ = H k + 1 ⋅ s k + 1 ⋅ H k + 1 T \hat{x_\mu} = H_{k+1} \cdot \hat{x_{k+1}} \\ s_\mu = H_{k+1} \cdot s_{k+1} \cdot H_{k+1}^T xμ^=Hk+1⋅xk+1^sμ=Hk+1⋅sk+1⋅Hk+1T
6. 校正
预测值与测量值都存在误差,且都服从正态分布。
令预测值~ N ( μ 0 , σ 0 ) N(\mu_0, \sigma_0) N(μ0,σ0),测量值~ N ( μ 1 = z k + 1 , σ 1 = R k + 1 ) N(\mu_1=z_{k+1},\sigma_1=\sqrt{R_{k+1}}) N(μ1=zk+1,σ1=Rk+1)。
要通过两个正态分布取得最合适的结果,我们要取两个正态分布的交集。
两个正态分布的交集也服从正态分布,且等于两者的积:
N ( μ ′ , σ ′ ) = N ( μ 0 , σ 0 ) ⋅ N ( μ 1 , σ 1 ) = 1 2 π σ ′ e − ( x − μ ′ ) 2 2 σ ′ 2 N(\mu', \sigma')=N(\mu_0, \sigma_0)\cdot N(\mu_1, \sigma_1) = \frac{1}{\sqrt{2\pi}\sigma'}e^{-\frac{(x-\mu')^2}{2\sigma'^2}} N(μ′,σ′)=N(μ0,σ0)⋅N(μ1,σ1)=2πσ′1e−2σ′2(x−μ′)2
(两个高斯函数的乘积可参考:两个高斯分布相乘结果是什么?)
可得:
{ μ ′ = μ 0 + σ 0 2 ( μ 1 − μ 0 ) σ 0 2 + σ 1 2 σ ′ 2 = σ 0 2 − σ 0 4 σ 0 2 + σ 1 2 \left\{ \begin{aligned} & \mu'= \mu_0+\frac{\sigma_0^2(\mu_1-\mu_0)}{\sigma_0^2+\sigma_1^2}\\ & \sigma'^2 = \sigma_02 - \frac{\sigma_0^4}{\sigma_0^2+\sigma_1^2} \end{aligned} \right. ⎩⎪⎪⎪⎨⎪⎪⎪⎧μ′=μ0+σ02+σ12σ02(μ1−μ0)σ′2=σ02−σ02+σ12σ04
定义K简化公式:
K ≡ σ 0 2 σ 0 2 + σ 1 2 { μ ′ = μ 0 + K ( μ 1 − μ 0 ) σ ′ 2 = σ 0 2 − K σ 0 2 K \equiv \frac{\sigma_{0}^{2}}{\sigma_{0}^{2}+\sigma_{1}^{2}} \\ \ \\ \left\{\begin{array}{l} \mu^{\prime}=\mu_{0}+K\left(\mu_{1}-\mu_{0}\right) \\ \sigma^{\prime 2}=\sigma_{0}^{2}-K \sigma_{0}^{2} \end{array}\right. K≡σ02+σ12σ02 {μ′=μ0+K(μ1−μ0)σ′2=σ02−Kσ02
因为预测状态和协方差分别代表了正态分布的均值和方差,测量值也是如此,所以两者可以等价起来(注意这里的 σ ′ 2 \sigma'^2 σ′2是方差):
{ ( μ 0 , σ 0 2 ) = ( x μ ^ , s μ ) = ( H k + 1 x ^ k + 1 , H k + 1 ⋅ S k + 1 ⋅ H k + 1 ) ( μ 1 , σ 1 2 ) = ( z k + 1 , R k + 1 ) \left\{ \begin{aligned} & (\mu_0, \sigma_0^2) = (\hat{x_\mu}, s_\mu) = (H_{k+1} \hat{x}_{k+1},H_{k+1} \cdot S_{k+1} \cdot H_{k+1}) \\ & (\mu_1, \sigma_1^2) = (z_{k+1}, R_{k+1}) \end{aligned} \right. {(μ0,σ02)=(xμ^,sμ)=(Hk+1x^k+1,Hk+1⋅Sk+1⋅Hk+1)(μ1,σ12)=(zk+1,Rk+1)
把他们代入上式 μ ′ \mu' μ′和 σ ′ 2 \sigma'^2 σ′2中:
{ μ ′ = H k + 1 x k + 1 ^ + K ( z k + 1 − H k + 1 x k + 1 ^ ) σ ′ 2 = H k + 1 ⋅ S k + 1 ⋅ H k + 1 − K H k + 1 ⋅ S k + 1 ⋅ H k + 1 K = H k + 1 ⋅ S k + 1 ⋅ H k + 1 ( H k + 1 ⋅ S k + 1 ⋅ H k + 1 + R k + 1 ) − 1 \left\{ \begin{aligned} &\mu'= H_{k+1}\hat{x_{k+1}}+K(z_{k+1}-H_{k+1}\hat{x_{k+1}})\\ &\sigma'^2= H_{k+1} \cdot S_{k+1} \cdot H_{k+1} - K H_{k+1} \cdot S_{k+1} \cdot H_{k+1} \end{aligned} \right. \ \\ \ \\ K=H_{k+1} \cdot S_{k+1} \cdot H_{k+1}(H_{k+1} \cdot S_{k+1} \cdot H_{k+1} +R_{k+1})^{-1} {μ′=Hk+1xk+1^+K(zk+1−Hk+1xk+1^)σ′2=Hk+1⋅Sk+1⋅Hk+1−KHk+1⋅Sk+1⋅Hk+1 K=Hk+1⋅Sk+1⋅Hk+1(Hk+1⋅Sk+1⋅Hk+1+Rk+1)−1
我们把卡尔曼滤波最终求得的状态和协方差定义为: x k + 1 ′ ^ \hat{x_{k+1}'} xk+1′^和 s k + 1 ′ s_{k+1}' sk+1′。
正如预测状态要通过映射矩阵 H k + 1 H_{k+1} Hk+1映射到高斯分布中,最终的结果 x k + 1 ′ ^ \hat{x_{k+1}'} xk+1′^和 s k + 1 ′ s_{k+1}' sk+1′也同样要进行映射,所以:
{ H k + 1 x k + 1 ′ ^ = μ ′ = H k + 1 x k + 1 ^ + K ( z k + 1 − H k + 1 x k + 1 ^ ) H k + 1 s k + 1 ′ H k + 1 T = σ ′ 2 = H k + 1 ⋅ S k + 1 ⋅ H k + 1 − K H k + 1 ⋅ S k + 1 ⋅ H k + 1 \left\{ \begin{aligned} &H_{k+1}\hat{x_{k+1}'} = \mu' = H_{k+1}\hat{x_{k+1}}+K(z_{k+1}-H_{k+1}\hat{x_{k+1}})\\ &H_{k+1} s_{k+1}'H_{k+1}^T = \sigma'^2= H_{k+1} \cdot S_{k+1} \cdot H_{k+1} - K H_{k+1} \cdot S_{k+1} \cdot H_{k+1} \end{aligned} \right. {Hk+1xk+1′^=μ′=Hk+1xk+1^+K(zk+1−Hk+1xk+1^)Hk+1sk+1′Hk+1T=σ′2=Hk+1⋅Sk+1⋅Hk+1−KHk+1⋅Sk+1⋅Hk+1
两边同时约掉 H k + 1 H_{k+1} Hk+1和 H k + 1 T H_{k+1}^T Hk+1T后,最终得到的卡尔曼滤波的五个公式就是:
{ x ^ k + 1 = F k + 1 ⋅ x ^ k + B k ⋅ u k s k + 1 = F k + 1 ⋅ s k ⋅ F k + 1 T + Q k + 1 K ′ = S k + 1 ⋅ H k + 1 T ⋅ ( H k + 1 ⋅ S k + 1 ⋅ H k + 1 T + R k + 1 ) − 1 { x ^ k + 1 ′ = x ^ k + 1 + K ′ ( z k + 1 − H k + 1 ⋅ x ^ k + 1 ) s k + 1 ′ = s k + 1 − K ′ ⋅ H k + 1 ⋅ s k + 1 \left\{ \begin{aligned} &\hat{x}_{k+1}=F_{k+1} \cdot \hat{x}_{k}+B_{k} \cdot u_{k} \\ &\boldsymbol{s}_{k+1}=\boldsymbol{F}_{k+1} \cdot \boldsymbol{s}_{k} \cdot \boldsymbol{F}_{k+1}^{T}+Q_{k+1} \end{aligned} \right. \\ \ \\ K^{\prime}=S_{k+1} \cdot H_{k+1}^{T} \cdot\left(H_{k+1} \cdot S_{k+1} \cdot H_{k+1}^{T}+R_{k+1}\right)^{-1} \\ \ \\ \left\{ \begin{aligned} &\hat{\boldsymbol{x}}_{k+1}^{\prime}=\hat{\boldsymbol{x}}_{k+1}+K^{\prime}\left(z_{k+1}-H_{k+1} \cdot \hat{\boldsymbol{x}}_{k+1}\right) \\ &\boldsymbol{s}_{k+1}^{\prime}=s_{k+1}-K^{\prime} \cdot H_{k+1} \cdot s_{k+1} \end{aligned} \right. {x^k+1=Fk+1⋅x^k+Bk⋅uksk+1=Fk+1⋅sk⋅Fk+1T+Qk+1 K′=Sk+1⋅Hk+1T⋅(Hk+1⋅Sk+1⋅Hk+1T+Rk+1)−1 {x^k+1′=x^k+1+K′(zk+1−Hk+1⋅x^k+1)sk+1′=sk+1−K′⋅Hk+1⋅sk+1
其中,前两个公式为预测(Predict),公式三为增益(Gain),后两个公式为校正(Correction)。
总结
当没有传感器(Sensor)做测量并校正预测结果时,FL就只用了前两个公式,状态转移矩阵 F k + 1 F_{k+1} Fk+1是根据经验设定的。
当有一个或多个传感器工作,则FL也会使用后三个公式做校准。
从后两个公式看出,最终的状态 x ^ k + 1 ′ \hat{\boldsymbol{x}}_{k+1}^{\prime} x^k+1′和方差 s k + 1 ′ \boldsymbol{s}_{k+1}^{\prime} sk+1′,是以预测的状态 x ^ k + 1 \hat{\boldsymbol{x}}_{k+1} x^k+1和方差 s k + 1 s_{k+1} sk+1为起点,加上或减去增益 K ′ K' K′乘以某个矩阵。
公式4增益后的 ( z k + 1 − H k + 1 ⋅ x ^ k + 1 ) \left(z_{k+1}-H_{k+1} \cdot \hat{\boldsymbol{x}}_{k+1}\right) (zk+1−Hk+1⋅x^k+1),可以看作测量值均值和映射后预测值的距离,即测量方差 R k + 1 R_{k+1} Rk+1越小,增益 K ′ K' K′越大, x ^ k + 1 ′ \hat{\boldsymbol{x}}_{k+1}^{\prime} x^k+1′向 z k + 1 z_{k+1} zk+1偏移得越多。
增益 K ′ K' K′的大小取决于预测和测量各自的方差的大小。也正如文初所说的,预测误差和测量误差都不可忽略,两者的大小共同决定了相信预测多一些还是相信测量多一些。