HMM: The Forward-Backward Algorithm


Author: 金良 ([email protected]); CSDN blog: http://blog.csdn.net/u012176591

State transition probability matrix

$$A = \begin{bmatrix} 0.4 & 0.1 & 0.5 \\ 0.3 & 0.4 & 0.3 \\ 0.4 & 0.3 & 0.3 \end{bmatrix}$$

Observation probability matrix
$$B = \begin{bmatrix} 0.5 & 0.5 \\ 0.4 & 0.6 \\ 0.7 & 0.3 \end{bmatrix}$$

Initial state probability distribution
$$\pi = [0.2 \quad 0.4 \quad 0.4]$$

$$\alpha_t(i) = P(o_1, o_2, \ldots, o_t, i_t = q_i \mid \lambda)$$
$$\beta_t(i) = P(o_{t+1}, o_{t+2}, \ldots, o_T \mid i_t = q_i, \lambda)$$
$$P(O \mid \lambda) = \sum_{i=1}^{N}\sum_{j=1}^{N} \alpha_t(i)\, a_{ij}\, b_j(o_{t+1})\, \beta_{t+1}(j)$$
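
The code in the rest of the post assumes NumPy and the parameters above. The observation sequence O is not shown explicitly here; the values below are read back from the printed test results further down, so treat O as an assumption. A minimal setup consistent with those results:

import numpy as np

A  = np.array([[0.4, 0.1, 0.5],
               [0.3, 0.4, 0.3],
               [0.4, 0.3, 0.3]])   # state transition probabilities a_ij
B  = np.array([[0.5, 0.5],
               [0.4, 0.6],
               [0.7, 0.3]])        # observation probabilities b_i(k)
PI = np.array([0.2, 0.4, 0.4])     # initial state distribution pi_i
O  = np.array([0, 1, 0, 1, 1])     # observation sequence inferred from the test outputs below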

Forward algorithm

def frontforward(A,B,PI,O,t):
    # Forward variable: returns the vector [alpha_t(i)] for 0 <= t <= T-1 (0-based t);
    # for t >= T it returns the total probability P(O|lambda).
    # n is the number of states, m is the number of observation symbols.
    n,m = np.shape(B)
    T = np.shape(O)[0]
    if 0 < t < T:  # recursion: alpha_t(i) = (sum_j alpha_{t-1}(j)*a_ji) * b_i(o_t)
        front = frontforward(A,B,PI,O,t-1)
        return [np.dot(front,A[:,i])*B[i][O[t]] for i in range(n)]
    elif t <= 0:   # initialization (t <= 0 is treated as t == 0): alpha_0(i) = pi_i * b_i(o_0)
        return [PI[i]*B[i][O[0]] for i in range(n)]
    else:          # termination (t >= T): P(O|lambda) = sum_i alpha_{T-1}(i)
        return np.sum(frontforward(A,B,PI,O,t-1))
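
As a quick sanity check under the setup assumed above, the initialization step alpha_0(i) = pi_i * b_i(o_0) can be computed directly; it reproduces the t=0 line of the test output further down:

alpha0 = PI*B[:,O[0]]   # elementwise: [0.2*0.5, 0.4*0.4, 0.4*0.7]
print alpha0            # [ 0.1   0.16  0.28]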

Backward algorithm

def backforward(A,B,PI,O,t):
    # Backward variable: returns the vector [beta_t(i)] for 0 <= t <= T-1 (0-based t);
    # for t == -1 it returns the total probability P(O|lambda).
    n,m = np.shape(B)
    T = np.shape(O)[0]
    if 0 <= t <= T-2:  # recursion: beta_t(i) = sum_j a_ij * b_j(o_{t+1}) * beta_{t+1}(j)
        back = backforward(A,B,PI,O,t+1)
        return [np.dot(back*B[:,O[t+1]],A[i,:]) for i in range(n)]
    elif t >= T-1:     # termination: beta_{T-1}(i) = 1
        return [1]*n
    else:              # t < 0: P(O|lambda) = sum_i pi_i * b_i(o_0) * beta_0(i)
        return np.sum(backforward(A,B,PI,O,0)*PI*B[:,O[0]])
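
Similarly, under the assumed setup the first non-trivial backward step beta_{T-2}(i) = sum_j a_ij * b_j(o_{T-1}) is just a matrix-vector product, and reproduces the t=3 line of the test output further down:

beta3 = np.dot(A, B[:,O[4]])   # [0.41, 0.48, 0.47]
print beta3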

Probability computation

def backfront(A,B,PI,O,t):
    # P(O|lambda) obtained by combining forward and backward variables at time t:
    # P(O|lambda) = sum_i sum_j alpha_t(i) * a_ij * b_j(o_{t+1}) * beta_{t+1}(j)
    n,m = np.shape(B)
    T = np.shape(O)[0]
    if 0 <= t <= T-2:
        front = frontforward(A,B,PI,O,t)
        back = backforward(A,B,PI,O,t+1)
        return np.sum([front[i]*back[j]*A[i,j]*B[j,O[t+1]] for i in range(n) for j in range(n)])
    elif t >= T-1:  # fall back to the pure forward computation
        return frontforward(A,B,PI,O,T)
    else:           # t < 0: fall back to the pure backward computation
        return backforward(A,B,PI,O,-1)
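
A useful cross-check of both recursions is the identity P(O|λ) = Σ_i α_t(i) β_t(i), which holds for every fixed t; a one-loop verification under the assumed setup:

for t in range(np.shape(O)[0]):
    # each line should print (approximately) the same value, 0.0278015638
    print np.sum(np.array(frontforward(A,B,PI,O,t))*np.array(backforward(A,B,PI,O,t)))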

Tests

print 'Test of the forward algorithm frontforward:'
for t in range(6):
    print 't=%d:'%t,frontforward(A,B,PI,O,t)

Test of the forward algorithm frontforward:
t=0: [0.10000000000000001, 0.16000000000000003, 0.27999999999999997]
t=1: [0.10000000000000001, 0.094799999999999995, 0.054599999999999996]
t=2: [0.04514, 0.02572, 0.066373999999999989]
t=3: [0.026160799999999998, 0.020828519999999996, 0.015059459999999998]
t=4: [0.011368329999999999, 0.0092791955999999981, 0.0071540381999999989]
t=5: 0.0278015638

print 'Test of the backward algorithm backforward:'
for t in range(5,-2,-1):
    print 't=%d:'%t,backforward(A,B,PI,O,t)

Test of the backward algorithm backforward:
t=5: [1, 1, 1]
t=4: [1, 1, 1]
t=3: [0.41000000000000003, 0.47999999999999998, 0.46999999999999997]
t=2: [0.18130000000000002, 0.219, 0.2107]
t=1: [0.11876500000000001, 0.10648200000000001, 0.10678700000000001]
t=0: [0.046159970000000002, 0.052981260000000002, 0.052530590000000002]
t=-1: 0.0278015638

print 'Test of the probability computation backfront:'
for t in range(-1,6):
    print 't=%d:'%t,backfront(A,B,PI,O,t)

Test of the probability computation backfront:
t=-1: 0.0278015638
t=0: 0.0278015638
t=1: 0.0278015638
t=2: 0.0278015638
t=3: 0.0278015638
t=4: 0.0278015638
t=5: 0.0278015638

Computing some probabilities and expected values

def gammat(A,B,PI,O,t,i):
    # gamma_t(i) = alpha_t(i)*beta_t(i) / sum_j alpha_t(j)*beta_t(j)
    n,m = np.shape(B)
    alpha = frontforward(A,B,PI,O,t)
    beta = backforward(A,B,PI,O,t)
    return 1.0*alpha[i]*beta[i]/np.sum([alpha[j]*beta[j] for j in range(n)])
def xit(A,B,PI,O,t,i,j):
    # xi_t(i,j) = alpha_t(i)*a_ij*b_j(o_{t+1})*beta_{t+1}(j), normalized over all pairs (i,j)
    n,m = np.shape(B)
    alpha = frontforward(A,B,PI,O,t)
    beta = backforward(A,B,PI,O,t+1)
    prosingle = alpha[i]*A[i,j]*B[j][O[t+1]]*beta[j]
    prosum = np.sum([alpha[p]*A[p,q]*B[q][O[t+1]]*beta[q] for p in range(n) for q in range(n)])
    return 1.0*prosingle/prosum
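
As a consistency check (a sketch under the assumed setup), for any fixed t the gamma_t(i) should sum to 1 over i, and the xi_t(i,j) should sum to 1 over all pairs (i,j):

print np.sum([gammat(A,B,PI,O,2,i) for i in range(3)])                   # -> 1.0
print np.sum([xit(A,B,PI,O,2,i,j) for i in range(3) for j in range(3)])  # -> 1.0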

Computation in log space

The following class is defined for numerical computation in log space (useful for avoiding floating-point underflow when the probabilities become very small).

class Logspace:
    # Helpers for arithmetic on values stored as natural logarithms,
    # with the sentinel LOGZERO standing for ln(0).
    def __init__(self):
        self.LOGZERO = 'LOGZERO'
    def eexp(self,x):
        # extended exp: maps LOGZERO back to 0
        if x == self.LOGZERO:
            return 0
        else:
            return np.exp(x)
    def eln(self,x):
        # extended ln: ln(0) -> LOGZERO; negative input is an error
        if x == 0:
            return self.LOGZERO
        elif x>0:
            return np.log(x)
        else:
            print 'Wrong!!!\n\t negative input error'
            return np.nan
    def elnsum(self,elnx,elny):
        # ln(x+y) from ln(x) and ln(y), factoring out the larger term for stability
        if elnx == self.LOGZERO:
            return elny
        elif elny == self.LOGZERO:
            return elnx
        elif elnx > elny:
            return elnx + self.eln(1+np.exp(elny-elnx))
        else:
            return elny + self.eln(1+np.exp(elnx-elny))
    def elnproduct(self,elnx,elny):
        # ln(x*y) = ln(x) + ln(y); a factor of 0 gives 0
        if elnx == self.LOGZERO or elny == self.LOGZERO:
            return self.LOGZERO
        else:
            return elnx + elny

Below are test statements for the Logspace class and their results:

logspace = Logspace()

elnx = logspace.eln(np.exp(60))
elny = logspace.eln(np.exp(-10))
elnx_plus_y = logspace.elnsum(elnx,elny)
print elnx_plus_y
elnx_prod_y = logspace.elnproduct(elnx,elny)
print elnx_prod_y
logzero = logspace.eln(0)
elnx_plus_y = logspace.elnsum(elnx,logzero)
print elnx_plus_y
elnx_prod_y = logspace.elnproduct(elnx,logzero)
print elnx_prod_y
print logspace.eln(-1)

Test results:

60.0
50.0
60.0
LOGZERO
Wrong!!!
negative input error
nan
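
The tests above only exercise the helpers themselves. As an illustration of what the class is for, here is a minimal sketch (not the original code of this post) of the forward recursion carried out entirely in log space, reusing the A, B, PI, O assumed earlier and the logspace instance above:

def eln_frontforward(A,B,PI,O):
    # Log-space forward pass: eln_alpha[t][i] holds ln(alpha_t(i)); LOGZERO stands for ln(0).
    n,m = np.shape(B)
    T = np.shape(O)[0]
    eln_alpha = [[logspace.LOGZERO]*n for _ in range(T)]
    for i in range(n):  # initialization: ln(pi_i * b_i(o_0))
        eln_alpha[0][i] = logspace.elnproduct(logspace.eln(PI[i]), logspace.eln(B[i][O[0]]))
    for t in range(1,T):
        for j in range(n):  # recursion: ln(sum_i alpha_{t-1}(i)*a_ij) + ln(b_j(o_t))
            acc = logspace.LOGZERO
            for i in range(n):
                acc = logspace.elnsum(acc, logspace.elnproduct(eln_alpha[t-1][i], logspace.eln(A[i,j])))
            eln_alpha[t][j] = logspace.elnproduct(acc, logspace.eln(B[j][O[t]]))
    eln_prob = logspace.LOGZERO  # termination: ln P(O|lambda) = ln(sum_i alpha_{T-1}(i))
    for i in range(n):
        eln_prob = logspace.elnsum(eln_prob, eln_alpha[T-1][i])
    return eln_prob

print logspace.eexp(eln_frontforward(A,B,PI,O))  # should reproduce 0.0278015638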

