HTK Speech Recognition ASK

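Large-vocabulary continuous speech recognition uses initials and finals (声韵母) as the modeling units.

To add a new pronunciation, it is enough to modify only dict and gram.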

 

Converting HTK recognition output to time

13600000 16320000 hao -1452.207031

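Divide the two time values by 10^7 to get seconds.

So hao is spoken from 1.36 s to 1.632 s; in other words, HTK reports times in units of 100 nanoseconds.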
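The conversion can be sketched in a few lines of Python. This is only an illustration: the function name parse_label_line is made up here, and it assumes the four-field layout (start, end, label, score) of the line shown above.

```python
HTK_TIME_UNITS_PER_SECOND = 10_000_000  # one HTK time unit is 100 ns


def parse_label_line(line):
    """Split 'start end label score' and convert the times to seconds."""
    start, end, label, score = line.split()
    return (
        int(start) / HTK_TIME_UNITS_PER_SECOND,
        int(end) / HTK_TIME_UNITS_PER_SECOND,
        label,
        float(score),
    )


start_s, end_s, label, score = parse_label_line("13600000 16320000 hao -1452.207031")
print(f"{label}: {start_s:.3f} s to {end_s:.3f} s")
```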

 

 

The HTK "bad data or over pruning" warning

 

I found a mailing-list message about it.

 

Question:

Hi,

Omer Moav wrote:

Processing Data: cmu_us_arctic_awb_a0015.cmp; Label cmu_us_arctic_awb_a0015.lab

 Unable to traverse 105 states in 1 frames

 WARNING [-7324]  StepBack: File

/home/omergil/downloads/HTS-demo_CMU-ARCTIC-AWB/cmp/cmu_us_arctic_awb_a0015.cmp

- bad data or over pruning

 in /home/omergil/downloads/htk/bin.linux/HERest

The above warning means that cmu_us_arctic_awb_a0015.cmp is corrupted.

I recommend you to re-run it.

 

 

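Following the warning, I looked up the file it names. In my case the cause was that the speech had been cut into segments that were too short: the segment held only about half of an utterance, so it could not be matched during training and a warning was issued.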
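One way to hunt for such segments before training is to read the frame count from each parameter file header. The sketch below assumes the standard 12-byte HTK header stored big-endian (nSamples, sampPeriod, sampSize, parmKind); the helper names and the min_frames threshold are hypothetical and need to be chosen for your own models.

```python
import struct


def read_htk_header(path):
    """Return (n_frames, samp_period_100ns, samp_size_bytes, parm_kind)."""
    with open(path, "rb") as f:
        header = f.read(12)
    if len(header) < 12:
        raise ValueError(f"{path}: shorter than the 12-byte HTK header")
    # Big-endian: int32 nSamples, int32 sampPeriod, int16 sampSize, int16 parmKind
    return struct.unpack(">iihh", header)


def find_short_files(paths, min_frames):
    """List (path, n_frames) for files with fewer than min_frames frames."""
    short = []
    for path in paths:
        n_frames = read_htk_header(path)[0]
        if n_frames < min_frames:
            short.append((path, n_frames))
    return short
```

A file reported here with only a handful of frames is a likely candidate for the "Unable to traverse ... states in ... frames" message.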

 

 

Triphone model state tying

... (cause: the sp short-pause model had not been added)

WB sp

WB sil

TC

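sp and sil are declared as word boundaries (WB) and are responsible for breaking up the sentence; TC converts the labels to triphones.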

sil th ih s sp m ae n sp...

 HLEd.exe -n ./lists/triphones1  -l '*' -i ./labels/wintri.mlf mktri.led ./labels/aligned.mlf

After running this, the labels become:

sil th+ih th-ih+s ih-s sp m+ae m-ae+n ae-n sp...
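The relabeling shown above can be imitated in Python to make the rule explicit. This is a sketch of the word-internal expansion only, not of HLEd itself; the function name is made up, and it assumes that sp and sil are the only boundary symbols.

```python
def to_word_internal_triphones(phones, boundaries=("sp", "sil")):
    """Expand a phone sequence into word-internal triphone names."""
    out = []
    for i, phone in enumerate(phones):
        if phone in boundaries:
            out.append(phone)  # boundary symbols are kept unchanged
            continue
        name = phone
        # Add left context unless the neighbour is missing or a boundary.
        if i > 0 and phones[i - 1] not in boundaries:
            name = phones[i - 1] + "-" + name
        # Add right context under the same condition.
        if i + 1 < len(phones) and phones[i + 1] not in boundaries:
            name = name + "+" + phones[i + 1]
        out.append(name)
    return out


print(" ".join(to_word_internal_triphones("sil th ih s sp m ae n sp".split())))
# prints: sil th+ih th-ih+s ih-s sp m+ae m-ae+n ae-n sp
```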

 

 

Explanation of the mktri.hed produced by perl ./scripts/maketrihed ./lists/monophones1 ./lists/triphones1

CL ./lists/triphones1

TI T_b {(*-b+*,b+*,*-b).transP}

TI T_p {(*-p+*,p+*,*-p).transP}

...

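CL means clone: the monophone models are cloned to create every model listed in ./lists/triphones1.

TI means tie: here the transition matrices (transP) of all triphones with the same centre phone are tied into one macro such as T_b.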
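To show the pattern of the generated file, here is a small Python sketch that emits lines of the same shape as those listed above. It is not the maketrihed script; the function name and the phone list in the example are made up for illustration.

```python
def make_mktri_hed(monophones, triphone_list_path):
    """Build the CL line plus one TI line per monophone."""
    lines = [f"CL {triphone_list_path}"]
    for phone in monophones:
        # Tie the transition matrices of all triphones centred on this phone.
        lines.append(
            f"TI T_{phone} {{(*-{phone}+*,{phone}+*,*-{phone}).transP}}"
        )
    return lines


for line in make_mktri_hed(["b", "p"], "./lists/triphones1"):
    print(line)
```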

 

HERest -B -C ./config/config2 -I ./labels/wintri.mlf -t 250.0 150.0 1000.0 -s stats -S train.scp -H ./hmms/hmm11/macros -H ./hmms/hmm11/hmmsdef -M ./hmms/hmm12 ./lists/triphones1

-B saves the output model files in binary format; -s stats writes a statistics file named stats.

 

Building the decision tree

 

 

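Recognition results: correct and accurate

1) Word correct rate

correct = (N - D - S) / N * 100%

2) Recognition accuracy

accurate = (N - D - S - I) / N * 100%

N: number of words in the reference transcription

D: number of reference words deleted in the recognition result

S: number of reference words substituted in the recognition result

I: number of words inserted in the recognition result relative to the reference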
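The two formulas can be checked with a short Python sketch. The function name and the counts in the example are invented purely for illustration; they are not results from any experiment here.

```python
def word_scores(n, d, s, i):
    """Return (correct, accurate) in percent from the error counts."""
    if n <= 0:
        raise ValueError("N must be positive")
    correct = (n - d - s) / n * 100.0
    accurate = (n - d - s - i) / n * 100.0
    return correct, accurate


# Made-up counts: 100 reference words, 3 deletions, 7 substitutions, 5 insertions.
correct, accurate = word_scores(100, 3, 7, 5)
print(f"correct = {correct:.1f}%, accurate = {accurate:.1f}%")
```

Because insertions are subtracted only in the second formula, accurate can never exceed correct, and it can even become negative when there are many insertions.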

 

In gram.txt, I tried two ways of changing ( SENT-START (<$word>|) SENT-END).

Failed: changing it to <$word> and adding SENT-START to the word pronunciations.

Succeeded:

$word=a|ai|an|ang|ao|ba|bai|ban|bang|bao|bei|ben|beng|bi|bian|biao|bie|bin|bing|

bo|bu|ca|cai|can|cang|cao|ce|cen|ceng|cha|chai|chan|chang|chao|che|chen|cheng|chi|

chong|chou|chu|chua|chuai|chuan|chuang|chui|chun|chuo|ci|cong|cou|cu|cuan|cui|cun|

cuo|da|dai|dan|dang|dao|de|den|deng|di|dia|dian|diao|die|ding|diu|dong|dou|du|duan|

dui|dun|duo|e|en|eng|er|fa|fan|fang|fei|fen|feng|fo|fu|ga|gai|gan|gang|gao|ge|gei|

gen|geng|gong|gou|gu|gua|guai|guan|guang|gui|gun|guo|ha|hai|han|hang|hao|he|hei|hen|

heng|hong|hou|hu|hua|huai|huan|huang|hui|hun|huo|ji|jia|jian|jiang|jiao|jie|jin|jing|jiong|jiu|jv|jvan|jve|jvn|ka|kai|kan|kang|kao|ke|ken|keng|kong|kou|ku|kua|kuai|kuan|

kuang|kui|kun|kuo|la|lai|lan|lang|lao|le|lei|leng|li|lia|lian|liang|liao|lie|lin|

ling|liu|lo|long|lou|lu|luan|lve|lun|luo|lv|ma|mai|man|mang|mao|me|mei|men|meng|mi|

mian|miao|mie|min|ming|miu|mo|mou|mu|na|nai|nan|nang|nao|ne|nei|nen|neng|ni|nian|

niang|niao|nie|nin|ning|niu|nong|nu|nve|nv|nuan|nuo|nun|o|ou|pa|pai|pan|pang|pao|

pei|pen|peng|pi|pian|piao|pie|pin|ping|po|pou|pu|qi|qia|qian|qiang|qiao|qie|qin|

qing|qiong|qiu|qv|qvan|qve|qvn|ran|rang|rao|re|ren|reng|ri|rong|rou|ru|rua|ruan|rui|

run|ruo|sa|sai|san|sang|sao|se|sen|seng|sha|shai|shan|shang|shao|she|shen|sheng|shi|

shou|shu|shua|shuai|shuan|shui|shun|shuo|si|song|sou|su|suan|sui|sun|suo|ta|tai|tan|

tang|tao|te|tei|teng|ti|tian|tiao|tie|ting|tong|tou|tu|tuan|tui|tun|tuo|wa|wai|wan|

wang|wei|wen|weng|wo|wu|xi|xia|xian|xiang|xiao|xie|xin|xing|xiong|xiu|xv|xvan|xve|

xvn|ya|yan|yang|yao|ye|yv|yi|yin|ying|yo|yong|you|yvan|yve|yvn|za|zai|zan|zang|zao|

ze|zei|zen|zeng|zha|zhai|zhan|zhang|zhao|zhe|zhen|zheng|zhi|zhong|zhou|zhu|zhua|

zhuai|zhuan|zhuang|zhui|zhun|zhuo|zi|zong|zou|zu|zuan|zui|zun|zuo|fou|shuang|silence;

( SENT-START <$word> SENT-END)

Yes
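Typing several hundred syllables by hand is error-prone, so the grammar can be generated from a word list. A minimal Python sketch, assuming the same layout as the working grammar above; the function name and the three-word example are made up.

```python
def make_grammar(words):
    """Build the text of a grammar file: one $word variable plus the sentence rule."""
    alternatives = "|".join(words)
    return f"$word={alternatives};\n( SENT-START <$word> SENT-END)\n"


print(make_grammar(["a", "ai", "silence"]), end="")
```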

 

 

 

• Context-dependent modeling handles the following well:
  - Fine-grained modeling
  - Coarticulation
  - Pronunciation variation
  - Accent

 

 

• Levels of parameter sharing:
  - Model level
  - State level
  - Mixture level
  - Sharing of other parameters (e.g. transition matrices, means, variances, mixture weights)

 

How vFloors is generated and what it does

HCompV has a number of options specified for it. The -f option causes a variance floor macro

(called vFloors) to be generated which is equal to 0.01 times the global variance. This is a vector of

values which will be used to set a floor on the variances estimated in the subsequent steps.

That is, vFloors supplies the lower bound applied to the variances estimated in the following stages.
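The idea of a variance floor can be sketched in plain Python. This is an illustration of the arithmetic only, not of HCompV: the function names are made up, and the population variance (dividing by the number of frames) is an assumption here.

```python
def global_variance(frames):
    """Per-dimension variance over all frames (population variance, an assumption)."""
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[k] for f in frames) / n for k in range(dim)]
    return [sum((f[k] - means[k]) ** 2 for f in frames) / n for k in range(dim)]


def variance_floor(frames, scale=0.01):
    """Floor vector: scale times the global variance in each dimension."""
    return [scale * v for v in global_variance(frames)]


def apply_floor(variances, floor):
    """Raise any estimated variance that fell below its floor."""
    return [max(v, fl) for v, fl in zip(variances, floor)]


frames = [[0.0, 0.0], [2.0, 4.0]]      # two made-up 2-dimensional frames
floor = variance_floor(frames)         # 0.01 * [1.0, 4.0]
print(apply_floor([0.001, 5.0], floor))
```

In the example the first estimated variance lies below its floor and is raised to it, while the second is left unchanged.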

 

 

D:/tryputong>HVite -H ./hmms/hmm12/macros -H ./hmms/hmm12/hmmsdef -S test.scp -l * -i ./results/recout_step1.mlf -w wdnet -p 0.0 -s 5.0 ./dict/dict2 ./lists/triphones1

  ERROR [+8231]  GetHCIModel: Cannot find hmm [t-]ei[+???]

 FATAL ERROR - Terminating program HVite

 

 

1. Grammar notation: <> means one or more repetitions; [] means zero or one (optional).

 

Beauty of Mathematics, Part 3: Applications of Hidden Markov Models in Language Processing

http://www.google.cn/ggblog/googlechinablog/2006/04/blog-post_1583.html
