Dynamic Model包含:
在HMM中有 λ = ( π , A , B ) \lambda = (\pi, A, B) λ=(π,A,B),用于表示状态转移矩阵和发射矩阵。
由于Linear Dynamic System和Particle Filter中的隐变量与观测变量连续,状态转移矩阵和发射矩阵不用矩阵A、B表示,表示为:
Z t = g ( Z t − 1 , u , ε ) ↦ A X t = h ( Z t , u , δ ) ↦ B \begin{align} Z_t & = g(Z_{t-1}, u, \varepsilon) \mapsto A \\ X_t & = h(Z_t, u, \delta) \mapsto B \end{align} ZtXt=g(Zt−1,u,ε)↦A=h(Zt,u,δ)↦B
在Kalman Filter中我们假设以上两个公式均为线性,且噪声为Gauss。表示为:
Z t = A ⋅ Z t − 1 + B + ε ε ∽ N ( 0 , Q ) X t = C ⋅ Z t + D + δ δ ∽ N ( 0 , R ) \begin{align} Z_t & = A \cdot Z_{t-1} + B + \varepsilon & \varepsilon \backsim N(0, Q) \\ X_t & = C \cdot Z_t + D + \delta & \delta \backsim N(0, R) \end{align} ZtXt=A⋅Zt−1+B+ε=C⋅Zt+D+δε∽N(0,Q)δ∽N(0,R)
回顾:Kalman Filter通过预测+更新的方式求解Filtering问题
Step1: 求解Prediction问题
P ( Z t ∣ x 1 , … , x t − 1 ) = ∫ Z t − 1 P ( Z t ∣ Z t − 1 ) ⋅ P ( Z t − 1 ∣ x 1 , … , x t − 1 ) d Z t − 1 P(Z_{t} | x_1, \dots, x_{t-1}) = \int_{Z_{t-1}} P(Z_t | Z_{t-1}) \cdot P(Z_{t-1} | x_1, \dots, x_{t-1}) {\rm d}_{Z_{t-1}} P(Zt∣x1,…,xt−1)=∫Zt−1P(Zt∣Zt−1)⋅P(Zt−1∣x1,…,xt−1)dZt−1
Step2: 求解update问题
P ( Z t ∣ x 1 , … , x t ) ∝ P ( X t ∣ Z t ) ⋅ P ( Z t ∣ x 1 , … , x t − 1 ) P(Z_{t} | x_1, \dots, x_t) \propto P(X_t | Z_t) \cdot P(Z_t | x_1, \dots, x_{t-1}) P(Zt∣x1,…,xt)∝P(Xt∣Zt)⋅P(Zt∣x1,…,xt−1)
具体可以通过条件概率相互求解的公式求解。
而在Particle Filter中转移方程非线形非高斯,只能通过采样的方式求解。
由于转移方程非线性非高斯,所以只能采取近似方法求解Filtering问题。这里使用Monte Carlo Method,通过采样求取期望:
P ( Z ∣ X ) → E Z ∣ X [ f ( z ) ] = ∫ Z f ( z ) ⋅ P ( Z ∣ X ) d Z ≈ 1 N ∑ i = 1 N f ( Z ( i ) ) P(Z|X) \rightarrow E_{Z|X}[f(z)] = \int_Z {f(z) \cdot P(Z|X)} {\rm d}Z \approx \frac{1}{N} \sum_{i=1}^N f(Z^{(i)}) P(Z∣X)→EZ∣X[f(z)]=∫Zf(z)⋅P(Z∣X)dZ≈N1i=1∑Nf(Z(i))
其中 Z ( i ) Z^{(i)} Z(i)为样本,且 Z ( 1 ) , Z ( 2 ) , … , Z ( N ) ∽ P ( Z ∣ X ) Z^{(1)}, Z^{(2)}, \dots, Z^{(N)} \backsim P(Z|X) Z(1),Z(2),…,Z(N)∽P(Z∣X)。
已知问题:
E [ f ( Z ) ] = ∫ f ( Z ) p ( Z ) d Z E[f(Z)] = \int f(Z) p(Z) {\rm d}Z E[f(Z)]=∫f(Z)p(Z)dZ
求解方法:
但 p ( Z ) p(Z) p(Z)的分布复杂,无法直接采样,所以我们引入已知分布 q ( Z ) q(Z) q(Z), q ( Z ) q(Z) q(Z)也称为提议分布(Proposed dist):
E [ f ( Z ) ] = ∫ f ( Z ) p ( Z ) d Z = ∫ f ( Z ) ⋅ p ( Z ) q ( Z ) ⋅ q ( Z ) d Z = 1 N ∑ i = 1 N f ( Z ( i ) ) ⋅ p ( Z ) q ( Z ) \begin{align} E[f(Z)] & = \int f(Z) p(Z) {\rm d}Z \\ & = \int f(Z) \cdot \frac{p(Z)}{q(Z)} \cdot q(Z) {\rm d}Z\\ & = \frac{1}{N} \sum_{i=1}^{N} f(Z^{(i)}) \cdot \frac{p(Z)}{q(Z)} \end{align} E[f(Z)]=∫f(Z)p(Z)dZ=∫f(Z)⋅q(Z)p(Z)⋅q(Z)dZ=N1i=1∑Nf(Z(i))⋅q(Z)p(Z)
其中 p ( Z ) q ( Z ) \frac{p(Z)}{q(Z)} q(Z)p(Z)被称为weight,表示为 w ( i ) w^{(i)} w(i),用于表示提议分布与实际分布之间的相似度:
E [ f ( Z ) ] = 1 N ∑ i = 1 N f ( Z ( i ) ) ⋅ w ( i ) E[f(Z)] = \frac{1}{N} \sum_{i=1}^{N} f(Z^{(i)}) \cdot w^{(i)} E[f(Z)]=N1i=1∑Nf(Z(i))⋅w(i)
所以我们通过采样可以求出 f ( Z ( i ) ) f(Z^{(i)}) f(Z(i)),然后我们的目标就是求出对应的 w ( i ) w^{(i)} w(i)。
引入SIS的原因:
推导过程:
已知:
w t ( i ) ∝ P ( Z 1 : t ∣ X 1 : t ) q ( Z 1 : t ∣ X 1 : t ) w_t^{(i)} \propto \frac{P(Z_{1:t} | X_{1:t})}{q(Z_{1:t} | X_{1:t})} wt(i)∝q(Z1:t∣X1:t)P(Z1:t∣X1:t)
分解 P ( Z 1 : t ∣ X 1 : t ) P(Z_{1:t} | X_{1:t}) P(Z1:t∣X1:t),其中将已知量(只由观测变量构成的数据)假设为常数:
P ( Z 1 : t ∣ X 1 : t ) = P ( Z 1 : t , X 1 : t ) P ( X 1 : t ) = 1 C ⋅ P ( X t ∣ Z 1 : t , X 1 : t − 1 ) ⋅ P ( Z t , X 1 : t − 1 ) = 1 C ⋅ P ( X t ∣ Z t ) ⋅ P ( Z t ∣ Z 1 : t − 1 , X 1 : t − 1 ) ⋅ P ( Z 1 : t − 1 , X 1 : t − 1 ) = 1 C ⋅ P ( X t ∣ Z t ) ⋅ P ( Z t ∣ Z t − 1 ) ⋅ P ( Z 1 : t − 1 , X 1 : t − 1 ) = 1 C ⋅ P ( X t ∣ Z t ) ⋅ P ( Z t ∣ Z t − 1 ) ⋅ P ( Z 1 : t − 1 ∣ X 1 : t − 1 ) ⋅ P ( X 1 : t − 1 ) = D C ⋅ P ( X t ∣ Z t ) ⋅ P ( Z t ∣ Z t − 1 ) ⋅ P ( Z 1 : t − 1 ∣ X 1 : t − 1 ) \begin{align} P(Z_{1:t} | X_{1:t}) & = \frac{P(Z_{1:t}, X_{1:t})}{P(X_{1:t})} \\ & = \frac{1}{C} \cdot P(X_t | Z_{1:t}, X_{1:t-1}) \cdot P(Z_t, X_{1:t-1}) \\ & = \frac{1}{C} \cdot P(X_t | Z_{t}) \cdot P(Z_t| Z_{1:t-1}, X_{1:t-1}) \cdot P(Z_{1:t-1}, X_{1:t-1}) \\ & = \frac{1}{C} \cdot P(X_t | Z_{t}) \cdot P(Z_t| Z_{t-1}) \cdot P(Z_{1:t-1}, X_{1:t-1}) \\ & = \frac{1}{C} \cdot P(X_t | Z_{t}) \cdot P(Z_t| Z_{t-1}) \cdot P(Z_{1:t-1}| X_{1:t-1}) \cdot P(X_{1:t-1}) \\ & = \frac{D}{C} \cdot P(X_t | Z_{t}) \cdot P(Z_t| Z_{t-1}) \cdot P(Z_{1:t-1}| X_{1:t-1}) \end{align} P(Z1:t∣X1:t)=P(X1:t)P(Z1:t,X1:t)=C1⋅P(Xt∣Z1:t,X1:t−1)⋅P(Zt,X1:t−1)=C1⋅P(Xt∣Zt)⋅P(Zt∣Z1:t−1,X1:t−1)⋅P(Z1:t−1,X1:t−1)=C1⋅P(Xt∣Zt)⋅P(Zt∣Zt−1)⋅P(Z1:t−1,X1:t−1)=C1⋅P(Xt∣Zt)⋅P(Zt∣Zt−1)⋅P(Z1:t−1∣X1:t−1)⋅P(X1:t−1)=CD⋅P(Xt∣Zt)⋅P(Zt∣Zt−1)⋅P(Z1:t−1∣X1:t−1)
通过以上推导可将 P ( Z 1 : t ∣ X 1 : t ) P(Z_{1:t} | X_{1:t}) P(Z1:t∣X1:t)分解为由 P ( Z 1 : t − 1 ∣ X 1 : t − 1 ) P(Z_{1:t-1}| X_{1:t-1}) P(Z1:t−1∣X1:t−1)组成的公式
分解 q ( Z 1 : t ∣ X 1 : t ) q(Z_{1:t} | X_{1:t}) q(Z1:t∣X1:t):
q ( Z 1 : t ∣ X 1 : t ) = q ( Z t ∣ Z 1 : t − 1 , X 1 : t ) ⋅ q ( Z 1 : t − 1 ∣ X 1 : t ) = q ( Z t ∣ Z 1 : t − 1 , X 1 : t ) ⋅ q ( Z 1 : t − 1 ∣ X 1 : t − 1 ) \begin{align} q(Z_{1:t} | X_{1:t}) & = q(Z_t | Z_{1:t-1}, X_{1:t}) \cdot q(Z_{1:t-1}| X_{1:t}) \\ & = q(Z_t | Z_{1:t-1}, X_{1:t}) \cdot q(Z_{1:t-1}| X_{1:t-1}) \end{align} q(Z1:t∣X1:t)=q(Zt∣Z1:t−1,X1:t)⋅q(Z1:t−1∣X1:t)=q(Zt∣Z1:t−1,X1:t)⋅q(Z1:t−1∣X1:t−1)
结合起来就是:
w t ( i ) ∝ P ( Z 1 : t ∣ X 1 : t ) q ( Z 1 : t ∣ X 1 : t ) ∝ P ( X t ∣ Z t ) ⋅ P ( Z t ∣ Z t − 1 ) ⋅ P ( Z 1 : t − 1 ∣ X 1 : t − 1 ) q ( Z t ∣ Z 1 : t − 1 , X 1 : t ) ⋅ q ( Z 1 : t − 1 ∣ X 1 : t − 1 ) ∝ P ( X t ∣ Z t ) ⋅ P ( Z t ∣ Z t − 1 ) q ( Z t ∣ Z 1 : t − 1 , X 1 : t ) ⋅ w t − 1 ( i ) ∝ P ( X t ∣ Z t ) ⋅ P ( Z t ∣ Z t − 1 ) q ( Z t ∣ Z t − 1 , X 1 : t ) ⋅ w t − 1 ( i ) \begin{align} w_t^{(i)} & \propto \frac{P(Z_{1:t} | X_{1:t})}{q(Z_{1:t} | X_{1:t})} \\ & \propto \frac{P(X_t | Z_{t}) \cdot P(Z_t| Z_{t-1}) \cdot P(Z_{1:t-1}| X_{1:t-1})}{q(Z_t | Z_{1:t-1}, X_{1:t}) \cdot q(Z_{1:t-1}| X_{1:t-1})} \\ & \propto \frac{P(X_t | Z_{t}) \cdot P(Z_t| Z_{t-1})}{q(Z_t | Z_{1:t-1}, X_{1:t})} \cdot w_{t-1}^{(i)} \\ & \propto \frac{P(X_t | Z_{t}) \cdot P(Z_t| Z_{t-1})}{q(Z_t | Z_{t-1}, X_{1:t})} \cdot w_{t-1}^{(i)} \end{align} wt(i)∝q(Z1:t∣X1:t)P(Z1:t∣X1:t)∝q(Zt∣Z1:t−1,X1:t)⋅q(Z1:t−1∣X1:t−1)P(Xt∣Zt)⋅P(Zt∣Zt−1)⋅P(Z1:t−1∣X1:t−1)∝q(Zt∣Z1:t−1,X1:t)P(Xt∣Zt)⋅P(Zt∣Zt−1)⋅wt−1(i)∝q(Zt∣Zt−1,X1:t)P(Xt∣Zt)⋅P(Zt∣Zt−1)⋅wt−1(i)
具体可以表示为一个算法:
条件:t-1时刻的采样已完成 → w t − 1 ( i ) \rightarrow w_{t-1}^{(i)} →wt−1(i)已知。
t时刻:
for i = 1 to N:
Z t ( i ) ∽ q ( Z t ∣ Z t − 1 , X 1 : t ) Z_t^{(i)} \backsim q(Z_t | Z_{t-1}, X_{1:t}) Zt(i)∽q(Zt∣Zt−1,X1:t) // 采样
w t ( i ) ∝ P ( X t ∣ Z t ) ⋅ P ( Z t ∣ Z t − 1 ) q ( Z t ∣ Z t − 1 , X 1 : t ) ⋅ w t − 1 ( i ) w_t^{(i)} \propto \frac{P(X_t | Z_{t}) \cdot P(Z_t| Z_{t-1})}{q(Z_t | Z_{t-1}, X_{1:t})} \cdot w_{t-1}^{(i)} wt(i)∝q(Zt∣Zt−1,X1:t)P(Xt∣Zt)⋅P(Zt∣Zt−1)⋅wt−1(i) // 计算
endw t ( i ) w_t^{(i)} wt(i)要归一化, ∑ i = 1 N w t ( i ) \sum_{i=1}^{N} w_t^{(i)} ∑i=1Nwt(i)
但是通过SIS直接求解有一个问题: w t ( i ) w_t^{(i)} wt(i)的权值会退化——有的大有的小,随着维度上升,可能会出现如: w t ( 1 ) → 1 w_t^{(1)} \rightarrow 1 wt(1)→1但 w t ( N ) → 0 w_t^{(N)} \rightarrow 0 wt(N)→0的情况。解决方案有:
这里介绍一种最简单的重采样方法。
倘若第一遍的采样结果为第二列:
数据编号 | 权重(weight) | cdf | |
---|---|---|---|
x ( 1 ) x^{(1)} x(1) | 0.1 | 0.1 | 0.1 |
x ( 2 ) x^{(2)} x(2) | 0.1 | 0.1 | 0.2 |
x ( 3 ) x^{(3)} x(3) | 0.8 | 0.8 | 1 |
我们将权重假设为当前数据的概率,通过权重建立概率密度函数,并求出其分布函数,即可通过分段函数进行采样。
这样的优点是可以将数据集中在权重大的地方。
结合:重要性采样方法+SIS+Resampling,就是简单的粒子滤波求解方案:Basic Particle Filter
Particle Filter整体就是通过每个时刻的采样与迭代地计算权重,通过Monte Carlo方法预测的方法。
根据16.2.2已知迭代公式为:
w t ( i ) ∝ P ( X t ∣ Z t ) ⋅ P ( Z t ∣ Z t − 1 ) q ( Z t ∣ Z t − 1 , X 1 : t ) ⋅ w t − 1 ( i ) w_t^{(i)} \propto \frac{P(X_t | Z_{t}) \cdot P(Z_t| Z_{t-1})}{q(Z_t | Z_{t-1}, X_{1:t})} \cdot w_{t-1}^{(i)} wt(i)∝q(Zt∣Zt−1,X1:t)P(Xt∣Zt)⋅P(Zt∣Zt−1)⋅wt−1(i)
其中我们令 q ( Z t ∣ Z t − 1 , X 1 : t ) q(Z_t | Z_{t-1}, X_{1:t}) q(Zt∣Zt−1,X1:t)为用于采样的分布,我们假设采样的分布就是状态转移函数:
q ( Z t ∣ Z t − 1 , X 1 : t ) = p ( Z t ∣ Z t − 1 ( i ) ) q(Z_t | Z_{t-1}, X_{1:t}) = p(Z_t | Z_{t-1}^{(i)}) q(Zt∣Zt−1,X1:t)=p(Zt∣Zt−1(i))
可以简化计算,算法可以总结为"generate and test":
generate表示采样:采样的方式变成了:
Z t ( i ) ∽ q ( Z t ∣ Z t − 1 , X 1 : t ) ⟹ Z t ( i ) ∽ p ( Z t ∣ Z t − 1 ( i ) ) Z_t^{(i)} \backsim q(Z_t | Z_{t-1}, X_{1:t}) \implies Z_t^{(i)} \backsim p(Z_t | Z_{t-1}^{(i)}) Zt(i)∽q(Zt∣Zt−1,X1:t)⟹Zt(i)∽p(Zt∣Zt−1(i))
test表示通过权重的迭代计算进行预测:变成了:
w t ( i ) ∝ P ( X t ∣ Z t ( i ) ) ⋅ w t − 1 ( i ) w_t^{(i)} \propto P(X_t | Z_{t}^{(i)}) \cdot w_{t-1}^{(i)} wt(i)∝P(Xt∣Zt(i))⋅wt−1(i)
上面的方法总结下来就是:SIR Filter(Sampling-Importance-Resampling)——SIS + Resampling + ( q ( Z t ∣ Z t − 1 , X 1 : t ) = p ( Z t ∣ Z t − 1 ( i ) ) q(Z_t | Z_{t-1}, X_{1:t}) = p(Z_t | Z_{t-1}^{(i)}) q(Zt∣Zt−1,X1:t)=p(Zt∣Zt−1(i)))