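Large-vocabulary continuous speech recognition: modeling with initials and finals
When a new pronunciation is added, only dict and gram need to be modified.
Converting the times in HTK recognition results to seconds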
13600000 16320000 hao -1452.207031
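Divide the two time fields directly by 10^7.
The pronunciation of hao runs from 1.36 s to 1.632 s, which means HTK outputs times in units of 100 nanoseconds.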
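The conversion above can be sketched in a few lines of Python. This is only an illustration: the function name parse_label_line is made up for this note, and it assumes each result line has the form "start end label [score]" as in the example line.

```python
# HTK label times are integers in units of 100 ns, i.e. 10^7 units per second.
HTK_UNITS_PER_SECOND = 10_000_000


def parse_label_line(line):
    """Split one 'start end label [score]' line and convert times to seconds.

    parse_label_line is a hypothetical helper, not part of HTK.
    """
    fields = line.split()
    start = int(fields[0]) / HTK_UNITS_PER_SECOND
    end = int(fields[1]) / HTK_UNITS_PER_SECOND
    label = fields[2]
    # The score (log likelihood) is optional in label files.
    score = float(fields[3]) if len(fields) > 3 else None
    return start, end, label, score


start_s, end_s, label, score = parse_label_line("13600000 16320000 hao -1452.207031")
print(f"{label}: {start_s:.3f} s to {end_s:.3f} s")  # hao: 1.360 s to 1.632 s
```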
HTK "bad data or over pruning" warning
I found an email about it.
Q:
Hi,
Omer Moav wrote:
Processing Data: cmu_us_arctic_awb_a0015.cmp; Label cmu_us_arctic_awb_a0015.lab
Unable to traverse 105 states in 1 frames
WARNING [-7324] StepBack: File
/home/omergil/downloads/HTS-demo_CMU-ARCTIC-AWB/cmp/cmu_us_arctic_awb_a0015.cmp
- bad data or over pruning
in /home/omergil/downloads/htk/bin.linux/HERest
A:
The above warning means that cmu_us_arctic_awb_a0015.cmp is corrupted.
I recommend you to re-run it.
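Following the warning message, I looked up the file it named. In my case the cause was that the speech had been cut into segments that were too short: the file contained only about half a pronunciation, so it could not be matched during training and the warning was raised.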
Triphone model state tying
... (cause: the sp silence model had not been added)
WB sp
WB sil
TC
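sp and sil are responsible for segmentation: WB declares them as word-boundary symbols, and TC converts the phones between them into triphones.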
sil th ih s sp m ae n sp...
HLEd.exe -n ./lists/triphones1 -l '*' -i ./labels/wintri.mlf mktri.led ./labels/aligned.mlf
After this, the sequence becomes
sil th+ih th-ih+s ih-s sp m+ae m-ae+n ae-n sp...
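The effect of the WB and TC commands on the example can be imitated with a short Python sketch. It is a simplified illustration of word-internal triphone expansion, not HLEd's actual implementation; the function name and the boundaries parameter are invented here.

```python
def to_word_internal_triphones(phones, boundaries=("sp", "sil")):
    """Expand a monophone sequence into word-internal triphone names.

    Boundary symbols are copied unchanged, and context never crosses them
    (a rough imitation of 'WB sp', 'WB sil' followed by 'TC').
    """
    result = []
    for i, phone in enumerate(phones):
        if phone in boundaries:
            result.append(phone)
            continue
        # Left/right context only counts if it is a real phone, not a boundary.
        left = phones[i - 1] if i > 0 and phones[i - 1] not in boundaries else None
        right = phones[i + 1] if i + 1 < len(phones) and phones[i + 1] not in boundaries else None
        name = phone
        if left is not None:
            name = f"{left}-{name}"
        if right is not None:
            name = f"{name}+{right}"
        result.append(name)
    return result


mono = "sil th ih s sp m ae n sp".split()
print(" ".join(to_word_internal_triphones(mono)))
# sil th+ih th-ih+s ih-s sp m+ae m-ae+n ae-n sp
```

A phone that stands alone between two boundary symbols keeps its monophone name in this sketch.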
Explanation of the mktri.hed produced by: perl ./scripts/maketrihed ./lists/monophones1 ./lists/triphones1
CL ./lists/triphones1
TI T_b {(*-b+*,b+*,*-b).transP}
TI T_p {(*-p+*,p+*,*-p).transP}
...
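CL means clone: the models named in the list are created by cloning.
TI means tie: the listed items (here the transition matrices transP) share one set of parameters.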
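To make the file layout concrete, here is a small Python sketch that writes lines in the same shape as the mktri.hed excerpt shown above. It only reproduces that visible pattern; the real maketrihed Perl script may treat some models (for example sp or sil) differently, and the function name make_mktri_hed is hypothetical.

```python
def make_mktri_hed(monophones, triphone_list_path):
    """Build text shaped like the mktri.hed excerpt: one CL line, then one TI line per phone."""
    lines = [f"CL {triphone_list_path}"]
    for phone in monophones:
        # Tie the transition matrices of every triphone whose centre phone is `phone`.
        lines.append(f"TI T_{phone} {{(*-{phone}+*,{phone}+*,*-{phone}).transP}}")
    return "\n".join(lines)


print(make_mktri_hed(["b", "p"], "./lists/triphones1"))
```

For the two phones b and p this prints the CL line followed by the two TI lines shown in the excerpt.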
HERest -B -C ./config/config2 -I ./labels/wintri.mlf -t 250.0 150.0 1000.0 -s stats -S train.scp -H ./hmms/hmm11/macros -H ./hmms/hmm11/hmmsdef -M ./hmms/hmm12 ./lists/triphones1
-B stores the output files in binary format; -s stats generates the stats file.
Building the decision tree
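Recognition results: correct and accurate
1) Word correct rate
correct = (N - D - S) / N * 100%
2) Recognition accuracy
accurate = (N - D - S - I) / N * 100%
N: number of words in the reference transcription
D: number of reference words deleted in the recognition result
S: number of reference words substituted in the recognition result
I: number of words inserted in the recognition result relative to the reference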
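The two formulas can be checked with a few lines of Python. The counts in the example are made up purely for illustration, and the function name recognition_scores is hypothetical.

```python
def recognition_scores(n, d, s, i):
    """Return (correct, accurate) in percent from the counts N, D, S and I."""
    if n <= 0:
        raise ValueError("N must be positive")
    correct = 100.0 * (n - d - s) / n
    accurate = 100.0 * (n - d - s - i) / n
    return correct, accurate


# Illustrative counts only: 100 reference words, 3 deletions, 7 substitutions, 5 insertions.
correct, accurate = recognition_scores(100, 3, 7, 5)
print(f"correct = {correct:.2f}%, accurate = {accurate:.2f}%")
# correct = 90.00%, accurate = 85.00%
```

Because insertions are subtracted only in the second formula, accurate can never exceed correct, and it becomes negative when there are many insertions.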
In gram.txt, changing the rule to ( SENT-START (<$word>|
<$word> and adding SENT-START to the word pronunciations: this failed.
This worked: $word=a|ai|an|ang|ao|ba|bai|ban|bang|bao|bei|ben|beng|bi|bian|biao|bie|bin|bing|
bo|bu|ca|cai|can|cang|cao|ce|cen|ceng|cha|chai|chan|chang|chao|che|chen|cheng|chi|
chong|chou|chu|chua|chuai|chuan|chuang|chui|chun|chuo|ci|cong|cou|cu|cuan|cui|cun|
cuo|da|dai|dan|dang|dao|de|den|deng|di|dia|dian|diao|die|ding|diu|dong|dou|du|duan|
dui|dun|duo|e|en|eng|er|fa|fan|fang|fei|fen|feng|fo|fu|ga|gai|gan|gang|gao|ge|gei|
gen|geng|gong|gou|gu|gua|guai|guan|guang|gui|gun|guo|ha|hai|han|hang|hao|he|hei|hen|
heng|hong|hou|hu|hua|huai|huan|huang|hui|hun|huo|ji|jia|jian|jiang|jiao|jie|jin|jing|jiong|jiu|jv|jvan|jve|jvn|ka|kai|kan|kang|kao|ke|ken|keng|kong|kou|ku|kua|kuai|kuan|
kuang|kui|kun|kuo|la|lai|lan|lang|lao|le|lei|leng|li|lia|lian|liang|liao|lie|lin|
ling|liu|lo|long|lou|lu|luan|lve|lun|luo|lv|ma|mai|man|mang|mao|me|mei|men|meng|mi|
mian|miao|mie|min|ming|miu|mo|mou|mu|na|nai|nan|nang|nao|ne|nei|nen|neng|ni|nian|
niang|niao|nie|nin|ning|niu|nong|nu|nve|nv|nuan|nuo|nun|o|ou|pa|pai|pan|pang|pao|
pei|pen|peng|pi|pian|piao|pie|pin|ping|po|pou|pu|qi|qia|qian|qiang|qiao|qie|qin|
qing|qiong|qiu|qv|qvan|qve|qvn|ran|rang|rao|re|ren|reng|ri|rong|rou|ru|rua|ruan|rui|
run|ruo|sa|sai|san|sang|sao|se|sen|seng|sha|shai|shan|shang|shao|she|shen|sheng|shi|
shou|shu|shua|shuai|shuan|shui|shun|shuo|si|song|sou|su|suan|sui|sun|suo|ta|tai|tan|
tang|tao|te|tei|teng|ti|tian|tiao|tie|ting|tong|tou|tu|tuan|tui|tun|tuo|wa|wai|wan|
wang|wei|wen|weng|wo|wu|xi|xia|xian|xiang|xiao|xie|xin|xing|xiong|xiu|xv|xvan|xve|
xvn|ya|yan|yang|yao|ye|yv|yi|yin|ying|yo|yong|you|yvan|yve|yvn|za|zai|zan|zang|zao|
ze|zei|zen|zeng|zha|zhai|zhan|zhang|zhao|zhe|zhen|zheng|zhi|zhong|zhou|zhu|zhua|
zhuai|zhuan|zhuang|zhui|zhun|zhuo|zi|zong|zou|zu|zuan|zui|zun|zuo|fou|shuang|silence;
( SENT-START <$word> SENT-END)
Yes
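Context-dependent modeling handles the following well:
  - refined (more detailed) modeling
  - coarticulation
  - pronunciation variation
  - accents
Levels of parameter sharing:
  - model level (model-level)
  - state level (state-level)
  - mixture level (mixture-level)
  - sharing of other parameters (e.g. transition matrices, means, variances, mixture weights)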
How vFloors is generated and what it does
HCompV has a number of options specified for it. The -f option causes a variance floor macro
(called vFloors) to be generated which is equal to 0.01 times the global variance. This is a vector of
values which will be used to set a floor on the variances estimated in the subsequent steps.
In other words, vFloors gives the lower bound for the variances estimated in the following stages.
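The idea can be illustrated with a toy Python sketch: compute the global variance per feature dimension, scale it by 0.01, and clamp later variance estimates to that floor. This is not HCompV's code. The function names are invented, the data are toy values, and dividing by N (rather than N - 1) is an assumption of this sketch.

```python
def variance_floor(frames, scale=0.01):
    """Return scale * global variance for each feature dimension.

    frames is a list of equal-length feature vectors (toy stand-in for training data).
    """
    n = len(frames)
    dim = len(frames[0])
    means = [sum(f[k] for f in frames) / n for k in range(dim)]
    # Population variance (divide by n); an assumption of this sketch.
    variances = [sum((f[k] - means[k]) ** 2 for f in frames) / n for k in range(dim)]
    return [scale * v for v in variances]


def apply_floor(variances, floor):
    """Clamp each estimated variance so it never drops below its floor."""
    return [max(v, fl) for v, fl in zip(variances, floor)]


frames = [[0.0, 2.0], [2.0, 6.0], [4.0, 10.0]]  # toy data
floor = variance_floor(frames)
# The first variance is below its floor and is raised; the second is left alone.
print(apply_floor([0.001, 5.0], floor))
```

Flooring keeps a variance from collapsing toward zero when a state sees very little data during re-estimation.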
D:/tryputong>HVite -H ./hmms/hmm12/macros -H ./hmms/hmm12/hmmsdef -S test.scp -l * -i ./results/recout_step1.mlf -w wdnet -p 0.0 -s 5.0 ./dict/dict2 ./lists/triphones1
ERROR [+8231] GetHCIModel: Cannot find hmm [t-]ei[+???]
FATAL ERROR - Terminating program HVite
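1 Regular-expression notation: <> means one or more repetitions; [] means zero or one (optional)
The Beauty of Mathematics, Part 3 -- Applications of hidden Markov models in language processing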
http://www.google.cn/ggblog/googlechinablog/2006/04/blog-post_1583.html