Algorithm: Probability Updates with the Binary Bayes Filter

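1. Application scenario
Some problems in robotics are binary: the quantity of interest can take only two values. The probability of such a state can be estimated with a binary Bayes filter. For example, a robot facing a door may want to decide whether the door is open or closed. The binary state is assumed to be static: it does not change as the measurement data change. Like the door, it is either open or closed.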
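2. Formulas and derivation
The goal of the binary filter is to estimate the probability of state $x$. That probability moves together with the stored belief value: the higher the belief, the higher the probability. The two numbers need not be equal, though. When the belief is stored as odds or log odds, as it is below, the stored value is not itself a probability and can exceed 1.
In general, state $x$ may be affected by both control data and measurement data, but if the state is static, its belief is a function of the measurements only:
$$bel_n(x)=p(x|z_{1:n},u_{1:n})=p(x|z_{1:n})\qquad (1)$$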
First we need a formula for the belief of state $x$. The belief is usually implemented in log-odds form. The odds of state $x$ are defined as the ratio of the probability of the positive state to the probability of the negative state:
$$odds(p(x=1))=\cfrac{p(x)}{p(\neg x)}=\cfrac{p(x)}{1-p(x)}\qquad (2)$$
The log odds are the logarithm of this expression:
$$l(x)=\log(odds(p(x=1)))\qquad (3)$$
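Equations (2) and (3) translate directly into code. The following is a minimal sketch; the function names `odds` and `log_odds` are chosen here for illustration and do not come from any particular library.

```python
import math


def odds(p: float) -> float:
    """Odds of an event with probability p, as in equation (2)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return p / (1.0 - p)


def log_odds(p: float) -> float:
    """Log odds of an event with probability p, as in equation (3)."""
    return math.log(odds(p))


print(odds(0.5))      # even odds for a 50/50 state
print(log_odds(0.5))  # log odds of a 50/50 state
print(odds(0.8))      # odds grow without bound as p approaches 1
```

Probabilities of exactly 0 or 1 are rejected because the odds (and therefore the log odds) are undefined or infinite there.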
Given $n$ observations of the same state, the odds $odds(p(x=1|z_{[1:n]}))$ are:
$$\begin{aligned}odds(p(x=1|z_{[1:n]}))&=odds(p(x=1|z_{[1:n-1]}))\,odds(p(x=1|z_{[n]}))\,\cfrac{p(x=0)}{p(x=1)}\\&=odds(p(x=1|z_{[1:n-1]}))\,C(z_{[n]})\qquad (4)\end{aligned}$$
where $C(z_{[n]})=odds(p(x=1|z_{[n]}))\,\cfrac{p(x=0)}{p(x=1)}$. In other words, each new measurement only requires multiplying the previous odds (not the previous probability) by the factor $C(z_{[n]})$, which depends only on the current measurement and the prior.
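As a small worked instance of equation (4), assume purely illustrative values: a prior of $p(x=1)=0.5$, a previous belief of $p(x=1|z_{[1:n-1]})=0.6$, and a new reading for which $p(x=1|z_{[n]})=0.7$. Then:

$$\begin{aligned}odds(p(x=1|z_{[1:n-1]}))&=\cfrac{0.6}{0.4}=1.5\\C(z_{[n]})&=\cfrac{0.7}{0.3}\cdot\cfrac{0.5}{0.5}\approx 2.333\\odds(p(x=1|z_{[1:n]}))&\approx 1.5\times 2.333=3.5\\p(x=1|z_{[1:n]})&=\cfrac{3.5}{1+3.5}\approx 0.778\end{aligned}$$

A reading that favors $x=1$ therefore raises the belief from 0.6 to about 0.778.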
Equation (4) is derived as follows.
When no observation is available, the probability of state $x$ is $p(x=1)=0.5$, so
$$\cfrac{p(x=1)}{p(x=0)}=1$$
By Bayes' rule, and assuming that the measurement $z_{[n]}$ is conditionally independent of the earlier measurements given the state:
$$\begin{aligned}p(x=1|z_{[1:n]})&=\cfrac{p(z_{[n]}|x=1,z_{[1:n-1]})\,p(x=1|z_{[1:n-1]})}{p(z_{[n]}|z_{[1:n-1]})}\\&=\cfrac{p(z_{[n]}|x=1)\,p(x=1|z_{[1:n-1]})}{p(z_{[n]}|z_{[1:n-1]})}\end{aligned}$$
Since
$$p(z_{[n]}|x=1)=\cfrac{p(x=1|z_{[n]})\,p(z_{[n]})}{p(x=1)}$$
substituting gives
$$p(x=1|z_{[1:n]})=\cfrac{p(x=1|z_{[n]})\,p(z_{[n]})\,p(x=1|z_{[1:n-1]})}{p(x=1)\,p(z_{[n]}|z_{[1:n-1]})}$$
Similarly,
$$p(x=0|z_{[1:n]})=\cfrac{p(x=0|z_{[n]})\,p(z_{[n]})\,p(x=0|z_{[1:n-1]})}{p(x=0)\,p(z_{[n]}|z_{[1:n-1]})}$$
Dividing the first expression by the second, the terms $p(z_{[n]})$ and $p(z_{[n]}|z_{[1:n-1]})$ cancel:
$$\begin{aligned}odds(p(x=1|z_{[1:n]}))&=\cfrac{p(x=1|z_{[1:n]})}{p(x=0|z_{[1:n]})}\\&=\cfrac{p(x=1|z_{[n]})\,p(x=1|z_{[1:n-1]})\,p(x=0)}{p(x=0|z_{[n]})\,p(x=0|z_{[1:n-1]})\,p(x=1)}\\&=odds(p(x=1|z_{[1:n-1]}))\,odds(p(x=1|z_{[n]}))\,\cfrac{p(x=0)}{p(x=1)}\\&=odds(p(x=1|z_{[1:n-1]}))\,C(z_{[n]})\end{aligned}$$
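The derivation can be checked numerically. The sketch below compares the ordinary likelihood form of Bayes' rule with the odds product for the door example. The sensor probabilities `P_Z_GIVEN_OPEN` and `P_Z_GIVEN_CLOSED` are made-up values and all function names are hypothetical. A non-uniform prior is included so that the factor $p(x=0)/p(x=1)$ actually matters.

```python
def odds(p):
    return p / (1.0 - p)


# Hypothetical sensor model for the door example (assumed values).
P_Z_GIVEN_OPEN = 0.6    # p(z = "looks open" | x = 1)
P_Z_GIVEN_CLOSED = 0.2  # p(z = "looks open" | x = 0)


def direct_bayes(prev_p1):
    """p(x=1 | z_1:n) from p(x=1 | z_1:n-1) using the likelihood form of Bayes' rule."""
    num = P_Z_GIVEN_OPEN * prev_p1
    return num / (num + P_Z_GIVEN_CLOSED * (1.0 - prev_p1))


def inverse_sensor_model(prior):
    """p(x=1 | z_n): the posterior from a single reading and the prior alone."""
    num = P_Z_GIVEN_OPEN * prior
    return num / (num + P_Z_GIVEN_CLOSED * (1.0 - prior))


def odds_update(prev_p1, prior):
    """odds(p(x=1 | z_1:n)) as the product of previous odds and C(z_n)."""
    c = odds(inverse_sensor_model(prior)) * (1.0 - prior) / prior
    return odds(prev_p1) * c


for prior in (0.5, 0.3):
    belief = prior
    for n in range(1, 4):
        via_bayes = direct_bayes(belief)
        new_odds = odds_update(belief, prior)
        via_odds = new_odds / (1.0 + new_odds)
        print(prior, n, via_bayes, via_odds)  # the two columns should agree
        belief = via_bayes
```

Both routes give the same posterior up to floating-point rounding, because $C(z_{[n]})$ reduces to the likelihood ratio $p(z_{[n]}|x=1)/p(z_{[n]}|x=0)$.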
Taking logarithms turns the product into a sum. As new measurements arrive, the log odds are updated by:
$$l_n=\log(odds(p(x=1|z_{[1:n]})))=\log(odds(p(x=1|z_{[1:n-1]})))+\log(C(z_{[n]}))$$
The updated probability is then recovered from the log odds by:
$$bel_n(x)=1-\cfrac{1}{1+\exp(l_n)}$$
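Putting the log-odds update and the inverse transform together gives a very small filter. This is a minimal sketch, not a reference implementation: the class name `BinaryBayesFilter`, its method names, and the example readings are all assumptions made for illustration. Each update takes $p(x=1|z_{[n]})$, the probability suggested by the current reading alone.

```python
import math


def log_odds(p):
    return math.log(p / (1.0 - p))


def probability(l):
    """Invert the log odds: bel = 1 - 1 / (1 + exp(l))."""
    return 1.0 - 1.0 / (1.0 + math.exp(l))


class BinaryBayesFilter:
    """Log-odds filter for a static binary state (hypothetical helper class)."""

    def __init__(self, prior=0.5):
        self.l0 = log_odds(prior)  # log odds of the prior p(x=1)
        self.l = self.l0           # current log odds l_n

    def update(self, p_x_given_z):
        # log C(z_n) = log odds(p(x=1 | z_n)) - log odds(p(x=1))
        self.l += log_odds(p_x_given_z) - self.l0
        return probability(self.l)


door = BinaryBayesFilter(prior=0.5)
for p_open_given_z in (0.7, 0.7, 0.4):  # assumed per-reading probabilities
    print(door.update(p_open_given_z))
```

Working in log odds replaces the repeated multiplication with an addition and avoids values drifting toward 0 or 1 too quickly in floating point. For very large $l_n$, `math.exp` can overflow, so a practical version would likely clamp the log odds to a fixed range.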
This completes the probability update for a binary state.
Note: cartographer works with odds rather than log odds, so it computes the probability as $bel_n(x)=\cfrac{odds}{1+odds}$, and it organizes the probability multiplications into a lookup table.
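To illustrate the idea of an odds-based lookup table, here is a rough sketch. It is not Cartographer's actual code: the quantization range (`P_MIN`, `P_MAX`, `LEVELS`), the per-reading probabilities 0.55 and 0.49, and every function name are assumptions. Probabilities are stored as small integers, and the result of one odds multiplication is precomputed for each possible stored value, so an update at run time is a single table lookup.

```python
def odds(p):
    return p / (1.0 - p)


def probability_from_odds(o):
    return o / (1.0 + o)


# Assumed quantization: probabilities in [P_MIN, P_MAX] map to integers 0..LEVELS-1.
P_MIN, P_MAX, LEVELS = 0.1, 0.9, 256


def to_cell(p):
    """Clamp a probability and quantize it to an integer cell value."""
    p = min(max(p, P_MIN), P_MAX)
    return round((p - P_MIN) / (P_MAX - P_MIN) * (LEVELS - 1))


def from_cell(v):
    """Map an integer cell value back to a probability."""
    return P_MIN + v / (LEVELS - 1) * (P_MAX - P_MIN)


def build_table(p_x_given_z):
    """table[v] is the cell value after one reading, assuming a prior of 0.5."""
    c = odds(p_x_given_z)  # with p(x=1) = 0.5 the prior factor equals 1
    return [
        to_cell(probability_from_odds(odds(from_cell(v)) * c))
        for v in range(LEVELS)
    ]


hit_table = build_table(0.55)   # reading that supports x = 1 (assumed value)
miss_table = build_table(0.49)  # reading that supports x = 0 (assumed value)

cell = to_cell(0.5)
for _ in range(5):
    cell = hit_table[cell]      # one table lookup per measurement
print(from_cell(cell))
```

Clamping to `[P_MIN, P_MAX]` keeps the odds finite and lets the belief recover if later readings contradict earlier ones.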

References
https://zhuanlan.zhihu.com/p/74003207
https://zhuanlan.zhihu.com/p/140318042
