Building a Large-Vocabulary Continuous Speech Recognition System with HTK (Part 3)

10. Tied-state triphones

The script mktri.led:

WB sp
WB sil
TC

Run:

HLEd -n labels/triphones1 -i labels/wintri.mlf scripts/mktri.led labels/aligned.mlf
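To make the effect of mktri.led concrete, here is a minimal Python sketch of the expansion that WB and TC describe: sp and sil are treated as word boundaries, so they stay context-free and no context crosses them. The function name and the sample phone sequence are illustrative assumptions, and this only approximates what HLEd does.

```python
def to_word_internal_triphones(phones, boundaries=("sp", "sil")):
    """Expand a monophone sequence into word-internal triphone names.

    Hypothetical helper: mimics 'WB sp', 'WB sil', 'TC' for illustration only.
    """
    out = []
    n = len(phones)
    for i, p in enumerate(phones):
        if p in boundaries:
            # Boundary symbols are copied through unchanged.
            out.append(p)
            continue
        # A neighbour only contributes context if it is not a boundary symbol.
        left = phones[i - 1] if i > 0 and phones[i - 1] not in boundaries else None
        right = phones[i + 1] if i + 1 < n and phones[i + 1] not in boundaries else None
        name = p
        if left:
            name = f"{left}-{name}"
        if right:
            name = f"{name}+{right}"
        out.append(name)
    return out


if __name__ == "__main__":
    print(" ".join(to_word_internal_triphones("sil th ih s sp m ae n sil".split())))
```

Phones at the edge of a word come out as biphones (th+ih, ih-s), which is why triphones1 contains a mix of monophones, biphones and triphones.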

The script mktri.hed is generated with:

perl scripts/maketrihed data/monophones1 labels/triphones1
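For orientation, the mktri.hed file normally consists of a CL command followed by one TI command per monophone, tying the transition matrices of all triphones that share a centre phone. The sketch below reproduces that layout in Python. It is an assumption based on the HTK tutorial's description of the file, not a port of the Perl script, and make_mktri_hed is a hypothetical name.

```python
def make_mktri_hed(monophones, triphone_list="labels/triphones1"):
    """Return the text of an mktri.hed-style edit script (illustrative sketch)."""
    # CL clones the monophone models into every model named in the triphone list.
    lines = [f"CL {triphone_list}"]
    for p in monophones:
        # TI ties the transition matrices of all models whose centre phone is p.
        lines.append(f"TI T_{p} {{(*-{p}+*,{p}+*,*-{p}).transP}}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(make_mktri_hed(["ah", "b"]), end="")
```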

Put mktri.hed in the scripts folder, then run:

HHEd -H hmms/hmm9/macros -H hmms/hmm9/hmmdefs -M hmms/hmm10 scripts/mktri.hed data/monophones1

Re-estimate twice:

HERest  -A -D -T 1 -C def/config -I labels/wintri.mlf -t 250.0 150.0 1000.0 -S def/train.scp -H hmms/hmm10/macros -H hmms/hmm10/hmmdefs -M hmms/hmm11 labels/triphones1

HERest  -A -D -T 1 -C def/config -I labels/wintri.mlf -t 250.0 150.0 1000.0 -s stats -S def/train.scp -H hmms/hmm11/macros -H hmms/hmm11/hmmdefs -M hmms/hmm12 labels/triphones1

Generate the fulllist and tiedlist files with the following commands:

(1)HDMan -b sp -n lists/fulllist -g global3.ded -l floag dict/dict4-tri dict/dict4

where global3.ded is:

RS cmu

MP sil sil sp

TC

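dict4 is a new dictionary with the SENT-END, SENT-START and silence entries removed. The generated fulllist is missing the phones sil, ay, em and ow; add them by hand before running step (2) below.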
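Rather than spotting the missing entries by eye, the check can be scripted. The sketch below compares a list of required model names against the lines of fulllist and appends whatever is absent. The function names are hypothetical, and in practice the lists would be read from data/monophones1 and lists/fulllist.

```python
def missing_models(required, fulllist_lines):
    """Return the required model names that do not appear in fulllist."""
    present = {line.strip() for line in fulllist_lines if line.strip()}
    return [m for m in required if m not in present]


def add_missing(required, fulllist_lines):
    """Return the fulllist entries with any missing required models appended."""
    models = [line.strip() for line in fulllist_lines if line.strip()]
    return models + missing_models(required, models)


if __name__ == "__main__":
    required = ["sil", "ay", "em", "ow", "b"]   # e.g. the monophone list
    fulllist = ["b", "b+ay", "ay-b", ""]        # e.g. the lines of lists/fulllist
    print(missing_models(required, fulllist))
    print(add_missing(required, fulllist))
```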

(2)HHEd -H hmms/hmm12/macros -H hmms/hmm12/hmmdefs -M hmms/hmm13 tree.hed labels/triphones1>log

where tree.hed is generated with the script:

perl scripts/mkclscript.prl TB 350.0 data/monophones1>tree.hed
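The TB lines this produces follow a fixed pattern: one line per phone and per emitting state, each clustering that state across the monophone and all of its biphone and triphone variants. A rough Python equivalent is sketched below. The function name is hypothetical, and the three-state assumption (states 2, 3 and 4) is taken from the TB lines in this post.

```python
def make_tb_lines(phones, threshold=350.0, states=(2, 3, 4)):
    """Generate TB clustering commands in the style of mkclscript.prl output."""
    lines = []
    for s in states:
        for p in phones:
            # Item list covering the monophone plus its left/right context variants.
            items = f'"{p}","*-{p}+*","{p}+*","*-{p}"'
            lines.append(f'TB {threshold} "ST_{p}_{s}_" {{({items}).state[{s}]}}')
    return lines


if __name__ == "__main__":
    for line in make_tb_lines(["ax", "b"]):
        print(line)
```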

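The generated tree.hed contains only the TB lines. Copy the contents of the quests.hed file from the HTK samples (HTK\samples\RMHTK\lib\quests.hed) in front of them, then add the following statements at the beginning, in the middle, and at the end: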

(add) RO 100 stats
(add) TR 0
QS  "R_NonBoundary"           { *+* }
QS  "R_Silence"                      { *+sil }
QS  "R_Stop"              { *+p,*+pd,*+b,*+t,*+td,*+d,*+dd,*+k,*+kd,*+g }
………………….
QS  "L_y"                   { y-* }
QS  "L_z"                    { z-* }
(add) TR 2
TB 350.0 "ST_ax_2_" {("ax","*-ax+*","ax+*","*-ax").state[2]}
TB 350.0 "ST_b_2_" {("b","*-b+*","b+*","*-b").state[2]}
TB 350.0 "ST_r_2_" {("r","*-r+*","r+*","*-r").state[2]}
……………………
TB 350.0 "ST_sil_4_" {("sil","*-sil+*","sil+*","*-sil").state[4]}
TB 350.0 "ST_sp_4_" {("sp","*-sp+*","sp+*","*-sp").state[4]}
(add) TR 1
(add) AU lists/fulllist
(add) CO lists/tiedlist
(add) ST trees
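Since the final tree.hed is just four pieces concatenated in a fixed order, the manual copy-and-paste can be replaced by a small script. The sketch below assumes the QS lines and the TB lines have already been read into lists. The function name and default paths are illustrative, chosen to match the listing above.

```python
def assemble_tree_hed(questions, tb_lines, stats="stats",
                      fulllist="lists/fulllist", tiedlist="lists/tiedlist",
                      trees="trees"):
    """Join header, QS questions, TB commands and footer into one edit script."""
    parts = [f"RO 100 {stats}", "TR 0"]      # outlier threshold and stats file
    parts += questions                       # QS lines copied from quests.hed
    parts.append("TR 2")
    parts += tb_lines                        # TB lines from mkclscript.prl
    parts += ["TR 1",
              f"AU {fulllist}",              # synthesise unseen triphones
              f"CO {tiedlist}",              # compact into the tied list
              f"ST {trees}"]                 # save the decision trees
    return "\n".join(parts) + "\n"


if __name__ == "__main__":
    qs = ['QS  "L_z"  { z-* }']
    tb = ['TB 350.0 "ST_b_2_" {("b","*-b+*","b+*","*-b").state[2]}']
    print(assemble_tree_hed(qs, tb), end="")
```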
Then re-estimate twice:
HERest -C def/config -I labels/wintri.mlf -t 250.0 150.0 1000.0 -S def/train.scp -H hmms/hmm13/macros -H hmms/hmm13/hmmdefs -M hmms/hmm14 lists/tiedlist

HERest -C def/config -I labels/wintri.mlf -t 250.0 150.0 1000.0 -S def/train.scp -H hmms/hmm14/macros -H hmms/hmm14/hmmdefs -M hmms/hmm15 lists/tiedlist
 
 

Evaluate on the test set again:

Command:

HVite -C def/config2 -H hmms/hmm15/macros -H hmms/hmm15/hmmdefs -S test/test.scp -l * -i results/recout_hmm15.mlf -w dict/wdnet -p 0.0 -s 5.0 dict/dict3 lists/tiedlist

where config2 is config with the following lines added:

FORCECXTEXP = T
ALLOWXWRDEXP = F

This took about an hour and a half to run.

Command:

HResults -I rest/testwords.mlf lists/tiedlist results/recout_hmm15.mlf

The results are as follows:

[Figure 1: screenshot of the HResults output]
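When reading these figures, it helps to recall how the word-level percentages are derived from the counts HResults prints: N reference words, D deletions, S substitutions and I insertions. The sketch below restates those formulas in Python. The function name is hypothetical and the numbers in the usage example are made up, not taken from this experiment.

```python
def hresults_scores(n, d, s, i):
    """Return (%Correct, %Accuracy) from word-level counts.

    n: reference words, d: deletions, s: substitutions, i: insertions.
    """
    h = n - d - s                      # correctly recognised words
    corr = 100.0 * h / n               # ignores insertions
    acc = 100.0 * (h - i) / n          # penalises insertions as well
    return corr, acc


if __name__ == "__main__":
    # Made-up counts for illustration only.
    print(hresults_scores(100, 5, 10, 3))
```

The sentence-level figure is stricter: a sentence only counts as correct when its entire word sequence matches the reference.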

Why is the sentence-level recognition rate 0?
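One way to investigate is to compare the reference and recognized transcriptions utterance by utterance. A sentence only counts as correct if its whole word sequence matches, so a token that shows up in every recognized sentence but never in the references (for instance SENT-START or SENT-END, if the dictionary does not give them an empty output symbol) would push the sentence rate to 0 while leaving word-level scores reasonable. Whether that is what happened here is not known. The Python sketch below is a hypothetical diagnostic, with function names of my own choosing and a simplified view of the MLF format.

```python
from pathlib import PurePosixPath


def parse_mlf(text):
    """Parse MLF text into {utterance_name: [labels]} (simplified reader)."""
    utts, current = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "#!MLF!#":
            continue
        if line.startswith('"') and line.endswith('"'):
            # Key each entry by file stem so "*/s1.lab" and "*/s1.rec" line up.
            name = line.strip('"').replace("\\", "/")
            current = PurePosixPath(name).stem
            utts[current] = []
        elif line == ".":
            current = None
        elif current is not None:
            fields = line.split()
            # Recognizer output has "start end label score"; references have "label".
            if len(fields) >= 3 and fields[0].isdigit() and fields[1].isdigit():
                utts[current].append(fields[2])
            else:
                utts[current].append(fields[0])
    return utts


def tokens_only_in_rec(ref, rec):
    """Labels that occur in the recognized output but never in the references."""
    ref_vocab = {w for words in ref.values() for w in words}
    rec_vocab = {w for words in rec.values() for w in words}
    return sorted(rec_vocab - ref_vocab)


def mismatched(ref, rec):
    """Utterances present in both whose word sequences differ."""
    return {k: (ref[k], rec[k]) for k in ref if k in rec and ref[k] != rec[k]}


if __name__ == "__main__":
    ref = parse_mlf('#!MLF!#\n"*/s1.lab"\nONE\nTWO\n.\n')
    rec = parse_mlf('#!MLF!#\n"*/s1.rec"\n0 100 SENT-START -1.0\n'
                    '100 200 ONE -2.0\n200 300 TWO -3.0\n.\n')
    print(tokens_only_in_rec(ref, rec))
    print(mismatched(ref, rec))
```

In practice the two strings would be the contents of the reference MLF passed to HResults and of results/recout_hmm15.mlf.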

 

 
