hmmlearn实现了三种HMM模型类,按照观测状态是连续状态还是离散状态,可以分为两类。GaussianHMM和GMMHMM是连续观测状态的HMM模型,而MultinomialHMM是离散观测状态的模型,今天讲讲后者。
HMM介绍
一、HMM预测最可能的状态序列
import numpy as np
import pandas as pd
from hmmlearn import hmm
states = ["box 1", "box 2", "box3"]
n_states = len(states)
observations = ["red", "white"]
n_observations = len(observations)
start_probability = np.array([0.2, 0.4, 0.4])
transition_probability = np.array([
[0.5, 0.2, 0.3],
[0.3, 0.5, 0.2],
[0.2, 0.3, 0.5]])
emission_probability = np.array([
[0.5, 0.5],
[0.4, 0.6],
[0.7, 0.3]])
model = hmm.MultinomialHMM(n_components=n_states)
model.startprob_=start_probability
model.transmat_=transition_probability
model.emissionprob_=emission_probability
decode:查找与对应的最可能的状态序列X
seen = np.array([[0,1,0]]).reshape(-1,1)
logprob, box = model.decode(seen, algorithm="viterbi")
print("The ball picked:", ", ".join(map(lambda x: observations[x], seen.ravel())))
print("The hidden box", ", ".join(map(lambda x: states[x], box)))
predict:查找与对应的最可能的状态序列X ,与decode结果一样
box2 = model.predict(seen)
print("The ball picked:", ", ".join(map(lambda x: observations[x], seen.ravel())))
print("The hidden box", ", ".join(map(lambda x: states[x], box2)))
观测这一序列的概率
print(model.score(seen))
在HMM问题一中手动计算的结果是未取对数的原始概率是0.13022
import math
print(math.exp(-2.038545309915233))
二、 求模型参数 Π 、A、 B
states = ["box 1", "box 2", "box3"]
n_states = len(states)
observations = ["red", "white"]
n_observations = len(observations)
model2 = hmm.MultinomialHMM(n_components=n_states, n_iter=20, tol=0.01)
X2 = np.array([[0],[1],[0],[1],[0],[0],[0],[1],[1],[0],[1],[1]])
model2.fit(X2)
print (model2.startprob_)
print (model2.transmat_)
print (model2.emissionprob_)
print (model2.score(X2)) #计算模型下的对数概率
也可以写个循环,寻找概率最大的那组对应的参数
model_res = []
for i in range(5):
mod = model.fit(X2,lengths=[4,4,4])
sco =mod.score(X2)
sta =mod.startprob_
tra =mod.transmat_
emi =mod.emissionprob_
result ={'score':sco,'start':sta,'trans':tra,'emission':emi}
model_res.append(result)
bst = sorted(model_res, key=lambda x:x[0]['score'], reverse=True) #reverse为True,排第一的是最大,否则是最小
bst[0][1] #获取字典