HTK-HVite
This operation is similar to the HLEd word-to-phone mapping operation; however, in this case the HVite command can consider all pronunciations of each word (when a word has more than one pronunciation) and then output the pronunciation that best matches the acoustic data.
Here, the purpose of HVite is to add an sp (short pause) after every word.
For example:
"*/sample1.lab"
DIAL
ONE
TWO
THREE
FOUR
FIVE
SIX
SEVEN
EIGHT
NINE
OH
ZERO
.
After the word-to-phone conversion:
"*/sample1.lab"
sil
d
ay
l
w
ah
n
t
uw
th
r
iy
f
ow
r
f
ay
v
s
ih
k
s
s
eh
v
ih
n
ey
t
n
ay
n
ow
z
iy
r
ow
sil
.
A short sp is placed between consecutive words; the result after processing is:
"*/sample1.lab"
sil
d
ay
l
sp
w
ah
n
sp
t
uw
sp
th
r
iy
sp
f
ow
r
sp
f
ay
v
sp
s
ih
k
s
sp
s
eh
v
ih
n
sp
ey
t
sp
n
ay
n
sp
ow
sp
z
iy
r
ow
sp
sil
.
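One way to sanity-check output like the listing above is to count the sp labels in each utterance and compare that count with the number of words. The sketch below is a minimal shell/awk helper. It assumes an MLF written with -o SWT, so that each label line holds only the phone name. The function name count_sp is hypothetical and is not part of HTK.

```shell
# Print each label-file pattern in an MLF followed by its number of sp labels.
# Assumption: one label per line with the phone name as the only field.
count_sp() {
  awk '
    /^"/      { name = $0; n = 0; next }   # a quoted line opens a new label file
    $0 == "." { print name, n; next }      # a lone dot closes it
    $1 == "sp" { n++ }                     # count short-pause labels
  ' "$1"
}
```

Called as count_sp aligned.mlf, this prints one line per utterance. For the sample1 listing above the count would be 12, one sp for each of the twelve words.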
Command:
$ HVite -A -D -T 1 -l '*' -o SWT -b SENT-END -C config \
    -H hmm7/macros -H hmm7/hmmdefs -i aligned.mlf -m \
    -t 250.0 150.0 1000.0 -y lab -a -I words.mlf -S train.scp \
    dict monophones1 > HVite_log
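After a run like this it can be worth confirming that every utterance listed in train.scp received an alignment, because an utterance that cannot be aligned may be missing from the output MLF. The helper below is only a sketch. It assumes one data file per line in the script file and that every label file in the MLF starts with a quoted pattern line. The name check_aligned is hypothetical.

```shell
# Compare the number of utterances in a script file with the number of
# label files in an MLF. Returns success only when the counts match.
check_aligned() {
  scp_count=$(grep -c . "$1" || true)    # non-empty lines in the script file
  mlf_count=$(grep -c '^"' "$2" || true) # quoted pattern lines in the MLF
  echo "utterances in script: $scp_count, aligned: $mlf_count"
  [ "$scp_count" -eq "$mlf_count" ]
}
```

A typical call would be check_aligned train.scp aligned.mlf; a non-zero exit status suggests that some files were dropped, and HVite_log is the place to look for the reason.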
-l dir :
This specifies the directory in which to store the output label files. If this option is not used, HVite stores the label files in the same directory as the data. In particular, setting the option -l '*' causes a label file named xxx to be prefixed by the pattern "*/xxx" in the output MLF file. This is useful for generating MLFs that are independent of the location of the corresponding data files.
-o s :
Choose how the output labels should be formatted. s is a string with certain letters (from NSCTWM) indicating binary flags that control formatting options.
N Normalise acoustic scores by dividing by the duration (in frames) of the segment.
S Remove scores from output label. By default scores will be set to the total likelihood of the segment.
C Set the transcription labels to start and end on frame centres. By default start times are set to the start time of the frame and end times are set to the end time of the frame.
T Do not include times in output label files.
W Do not include words in output label files when performing state or model alignment.
M Do not include model names in output label files when performing state and model alignment.
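To make the effect of the flags more concrete, the sketch below approximates what -o SWT does to a model-level alignment: it drops the times, the scores and the word column and keeps only the label name. It is an illustration, not HVite's own code. It assumes label lines of the form "start end name [score [word]]", and the function name strip_times_scores is hypothetical.

```shell
# Keep only the label name from lines of the form: start end name [score [word]].
# Lines without two leading integer time fields (patterns, ".") pass through.
strip_times_scores() {
  awk '
    NF >= 3 && $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+$/ { print $3; next }
    { print }
  ' "$@"
}
```

Feeding it a line such as "2800000 3100000 w -210.0 ONE" (made-up values) yields just "w", which matches the bare phone lists shown earlier in this note.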
-b s : Define s as the utterance boundary word.
-H mmf : Load the HMM macro file mmf.
-i s : Output the transcriptions to the MLF s.
-y s : Set the output label file extension to s.
Option      Description                            Default
-a          align from label files                 off
-b s        def s as utterance boundary word       none
-c f        tied mixture pruning threshold         10.0
-d s        dir to find hmm definitions            current
-e          save direct audio rec output           off
-f          output full state alignment            off
-g          enable audio replay                    off
-h s        set speaker name pattern               *.mfc
-i s        Output transcriptions to MLF s         off
-j i        Online MLLR adaptation                 off
            Perform update every i utterances
-k          use an input transform                 off
-l s        dir to store label/lattice files       current
-m          output model alignment                 off
-n i [N]    N-best recognition (using i tokens)    off
-o s        output label formating NCSTWMX         none
-p f        inter model trans penalty (log)        0.0
-q s        output lattice formating ABtvaldmn     tvaldmn
-r f        pronunciation prob scale factor        1.0
-s f        grammar scale factor                   1.0
-t f [f f]  set pruning threshold                  0.0
-u i        set pruning max active                 0
-v f        set word end pruning threshold         0.0
-w [s]      recognise from network                 off
-x s        extension for hmm files                none
-y s        output label file extension            rec
-z s        generate lattices with extension s     off
-A          Print command line arguments           off
-B          Save HMMs/transforms as binary         off
-C cf       Set config file to cf                  default
-D          Display configuration variables        off
-E s [s]    set dir for parent xform to s          off
            and optional extension
-F fmt      Set source data format to fmt          as config
-G fmt      Set source label format to fmt         as config
-H mmf      Load HMM macro file mmf
-I mlf      Load master label file mlf
-J s [s]    set dir for input xform to s           none
            and optional extension
-K s [s]    set dir for output xform to s          none
            and optional extension
-L dir      Set input label (or net) dir           current
-P          Set target label format to fmt         as config
-S f        Set script file to f                   none
-T N        Set trace flags to N                   0
-V          Print version information              off
-X ext      Set input label (or net) file ext      lab
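The HVite command in this note passes three values to -t (250.0 150.0 1000.0). Going by the HTK Book's description of this option, the first value is the initial beam width; if processing fails at that threshold, the threshold is increased by the second value and the utterance is retried until the third value, the limit, is reached. The sketch below only lists the thresholds such a schedule would try. The exact stopping rule inside HVite is an assumption here, and the function name beam_schedule is hypothetical.

```shell
# List the pruning thresholds tried for "-t f i l": start at f, add i after
# each failure, stop once the limit l would be exceeded (assumed behaviour).
beam_schedule() {
  awk -v f="$1" -v i="$2" -v l="$3" 'BEGIN {
    for (t = f + 0; t <= l + 0; t += i) printf "%.1f\n", t
  }'
}
```

For the values used above, beam_schedule 250.0 150.0 1000.0 prints 250.0, 400.0, 550.0, 700.0, 850.0 and 1000.0, one per line, so an utterance would get up to six attempts under this reading.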