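Understanding how the HMM is used in Kaldi's yesno example

This post explains how the HMM FST is used, that is, how feature vectors end up mapped to phones.

It covers usage only; how the HMM and the model are generated is out of scope.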

Print the contents of Ha.fst from the yesno/s5/exp/mono0a/graph_tgpr directory:
boystray@boystray-All-Series:~/kaldi/egs/yesno/s5/exp/mono0a/graph_tgpr$ fstprint  Ha.fst 
0	1	0	1
0	7	20	2
0	10	26	3
0	13	31	4
0
1	2	2	0	3.26078463
1	3	3	0	0.771307588
1	4	4	0	0.694674611
2	3	6	0	1.60320568
2	4	7	0	0.514737606
2	5	8	0	1.60320568
3	2	9	0	2.19029689
3	4	11	0	0.253189087
3	5	12	0	2.19029689
4	2	13	0	2.36310244
4	3	14	0	2.36310244
4	5	16	0	0.208477736
5	6	18	0
6	0	0	0
7	8	22	0
8	9	24	0
9	0	0	0
10	11	28	0
11	12	30	0	-2.38418579e-07
12	0	0	0
13	0	0	0

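Column 1 is the source state, column 2 is the destination state, column 3 is the transition-id (the input label), and column 4 is the phone id (the output label). A fifth column, when present, is the arc weight. A line containing a single number (the "0" above) marks a final state.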
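
To make the column layout concrete, here is a minimal Python sketch that parses fstprint text into arcs. The names Arc and parse_fstprint are made up for this illustration and are not part of Kaldi or OpenFst; the sketch assumes whitespace-separated integer labels, as in the listing above, and treats a missing weight as 0.

```python
from typing import NamedTuple


class Arc(NamedTuple):
    src: int       # source state
    dst: int       # destination state
    ilabel: int    # transition-id (0 means epsilon)
    olabel: int    # phone id (0 means epsilon)
    weight: float  # arc weight; 0.0 when fstprint omits it


def parse_fstprint(text):
    """Return (arcs, final_states) parsed from fstprint text output."""
    arcs, finals = [], {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if len(fields) <= 2:
            # Final-state line: "state" or "state weight".
            finals[int(fields[0])] = float(fields[1]) if len(fields) == 2 else 0.0
        else:
            src, dst, ilabel, olabel = map(int, fields[:4])
            weight = float(fields[4]) if len(fields) == 5 else 0.0
            arcs.append(Arc(src, dst, ilabel, olabel, weight))
    return arcs, finals


# A few lines copied from the Ha.fst listing above.
SAMPLE = """\
0 1 0 1
0 7 20 2
0 10 26 3
0 13 31 4
0
1 2 2 0 3.26078463
"""

arcs, finals = parse_fstprint(SAMPLE)
for arc in arcs:
    print(arc)
print("final states:", finals)
```

On a real system the input would come from running fstprint on Ha.fst and capturing its output, rather than from a hard-coded string.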

The meaning of each transition-id can be looked up with show-transitions:
boystray@boystray-All-Series:~/kaldi/egs/yesno/s5/exp/mono0a$ ~/kaldi/src/bin/show-transitions phones.txt 0.mdl
/home/boystray/kaldi/src/bin/show-transitions phones.txt 0.mdl 
Transition-state 1: phone = SIL hmm-state = 0 pdf = 0
 Transition-id = 1 p = 0.25 [self-loop]
 Transition-id = 2 p = 0.25 [0 -> 1]
 Transition-id = 3 p = 0.25 [0 -> 2]
 Transition-id = 4 p = 0.25 [0 -> 3]
Transition-state 2: phone = SIL hmm-state = 1 pdf = 1
 Transition-id = 5 p = 0.25 [self-loop]
 Transition-id = 6 p = 0.25 [1 -> 2]
 Transition-id = 7 p = 0.25 [1 -> 3]
 Transition-id = 8 p = 0.25 [1 -> 4]
Transition-state 3: phone = SIL hmm-state = 2 pdf = 2
 Transition-id = 9 p = 0.25 [2 -> 1]
 Transition-id = 10 p = 0.25 [self-loop]
 Transition-id = 11 p = 0.25 [2 -> 3]
 Transition-id = 12 p = 0.25 [2 -> 4]
Transition-state 4: phone = SIL hmm-state = 3 pdf = 3
 Transition-id = 13 p = 0.25 [3 -> 1]
 Transition-id = 14 p = 0.25 [3 -> 2]
 Transition-id = 15 p = 0.25 [self-loop]
 Transition-id = 16 p = 0.25 [3 -> 4]
Transition-state 5: phone = SIL hmm-state = 4 pdf = 4
 Transition-id = 17 p = 0.75 [self-loop]
 Transition-id = 18 p = 0.25 [4 -> 5]
Transition-state 6: phone = Y hmm-state = 0 pdf = 5
 Transition-id = 19 p = 0.75 [self-loop]
 Transition-id = 20 p = 0.25 [0 -> 1]
Transition-state 7: phone = Y hmm-state = 1 pdf = 6
 Transition-id = 21 p = 0.75 [self-loop]
 Transition-id = 22 p = 0.25 [1 -> 2]
Transition-state 8: phone = Y hmm-state = 2 pdf = 7
 Transition-id = 23 p = 0.75 [self-loop]
 Transition-id = 24 p = 0.25 [2 -> 3]
Transition-state 9: phone = N hmm-state = 0 pdf = 8
 Transition-id = 25 p = 0.75 [self-loop]
 Transition-id = 26 p = 0.25 [0 -> 1]
Transition-state 10: phone = N hmm-state = 1 pdf = 9
 Transition-id = 27 p = 0.75 [self-loop]
 Transition-id = 28 p = 0.25 [1 -> 2]
Transition-state 11: phone = N hmm-state = 2 pdf = 10
 Transition-id = 29 p = 0.75 [self-loop]
 Transition-id = 30 p = 0.25 [2 -> 3]
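
The listing above can be turned into a lookup table from transition-id to phone. The following is only a sketch: parse_show_transitions is a hypothetical helper, and the regular expressions assume exactly the line format shown above (other invocations of show-transitions may print extra fields).

```python
import re

STATE_RE = re.compile(
    r"Transition-state (\d+): phone = (\S+) hmm-state = (\d+) pdf = (\d+)")
TRANS_RE = re.compile(r"Transition-id = (\d+) p = (\S+) \[(.+?)\]")


def parse_show_transitions(text):
    """Map each transition-id to its phone, hmm-state, pdf, probability and arc."""
    table = {}
    phone = hmm_state = pdf = None
    for line in text.splitlines():
        m = STATE_RE.search(line)
        if m:
            # A new transition-state header: remember its phone/state/pdf.
            phone, hmm_state, pdf = m.group(2), int(m.group(3)), int(m.group(4))
            continue
        m = TRANS_RE.search(line)
        if m and phone is not None:
            table[int(m.group(1))] = {
                "phone": phone,
                "hmm_state": hmm_state,
                "pdf": pdf,
                "p": float(m.group(2)),
                "arc": m.group(3),  # "self-loop" or e.g. "0 -> 1"
            }
    return table


# A few lines copied from the show-transitions output above.
SAMPLE = """\
Transition-state 6: phone = Y hmm-state = 0 pdf = 5
 Transition-id = 19 p = 0.75 [self-loop]
 Transition-id = 20 p = 0.25 [0 -> 1]
Transition-state 9: phone = N hmm-state = 0 pdf = 8
 Transition-id = 25 p = 0.75 [self-loop]
 Transition-id = 26 p = 0.25 [0 -> 1]
"""

table = parse_show_transitions(SAMPLE)
for tid in (20, 26):
    print(tid, table[tid])
```

With this table, transition-id 20 resolves to phone Y (hmm-state 0, arc "0 -> 1") and transition-id 26 to phone N.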


The phone ids are defined in phones.txt, which looks like this:
<eps> 0
SIL 1
Y 2
N 3
#0 4
#1 5
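
For reference, a symbol table in this "symbol id" format can be read into a dictionary with a few lines of Python. This is a minimal sketch; load_symbol_table is a hypothetical helper name, and the sample text assumes the usual convention that id 0 is the epsilon symbol <eps>.

```python
def load_symbol_table(text):
    """Return {id: symbol} from lines of the form 'symbol id'."""
    id_to_symbol = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        symbol, idx = fields
        id_to_symbol[int(idx)] = symbol
    return id_to_symbol


PHONES_TXT = """\
<eps> 0
SIL 1
Y 2
N 3
#0 4
#1 5
"""

phones = load_symbol_table(PHONES_TXT)
print([phones[i] for i in (1, 2, 3, 4)])  # ['SIL', 'Y', 'N', '#0']
```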

With that background, look again at the first few lines of Ha.fst:

source  destination  transition-id  phone id

0       1            0              1         outputs SIL (epsilon input)
0       7            20             2         outputs Y
0       10           26             3         outputs N
0       13           31             4         outputs #0

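Transition-id 20 is the 0 -> 1 transition of phone Y, so the arc that consumes it outputs Y; likewise, transition-id 26 outputs N. Transition-id 31 does not appear in the show-transitions listing, which stops at 30; its arc outputs the disambiguation symbol #0. That is how the phones are recognized; HCLG then goes on to recognize words and sentences.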
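
As an illustration of that mapping, the sketch below walks a sequence of transition-ids through a hand-copied subset of the Ha.fst arcs (only the Y and N paths) and collects the phones that come out. It is a toy under stated assumptions, not how Kaldi decodes: the real decoder searches HCLG using acoustic scores, and self-loop transition-ids (19, 21, 23, ...) do not appear in the Ha.fst listing at all, so they are left out here. The names ARCS, PHONES and phones_for are made up for this example.

```python
# Subset of arcs from the Ha.fst listing: (src, transition-id) -> (dst, phone id).
ARCS = {
    (0, 20): (7, 2), (7, 22): (8, 0), (8, 24): (9, 0), (9, 0): (0, 0),
    (0, 26): (10, 3), (10, 28): (11, 0), (11, 30): (12, 0), (12, 0): (0, 0),
}
PHONES = {1: "SIL", 2: "Y", 3: "N"}


def phones_for(transition_ids, start=0):
    """Follow the given transition-ids from `start` and return the emitted phones."""
    state, out = start, []
    for tid in transition_ids:
        # If the current state cannot consume tid, first take its
        # epsilon-input arc (in this subset, the arc back to state 0).
        while (state, tid) not in ARCS and (state, 0) in ARCS:
            state, olabel = ARCS[(state, 0)]
            if olabel:
                out.append(PHONES[olabel])
        state, olabel = ARCS[(state, tid)]  # KeyError if tid is not allowed here
        if olabel:
            out.append(PHONES[olabel])
    return out


print(phones_for([20, 22, 24]))                # ['Y']
print(phones_for([20, 22, 24, 26, 28, 30]))    # ['Y', 'N']
```

Only the first arc of each phone carries a non-zero phone id; the remaining arcs of the phone output epsilon, which is why each three-transition path yields exactly one phone.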

 
