A Detailed Analysis of the HCLG Files

HCLG

L.fst: The Phonetic Dictionary FST

L maps monophone sequences to words.

The file L.fst is the Finite State Transducer form of the lexicon with phone symbols on the input and word symbols on the output.
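To make the phones-in, words-out direction concrete, here is a minimal pure-Python sketch of a lexicon transducer. It is not Kaldi or OpenFst code: the three lexicon entries, the arc tuple layout, and the helper names `build_lexicon_arcs` and `transduce` are all assumptions made for illustration.

```python
# Toy lexicon (hypothetical entries): word -> phone sequence.
LEXICON = {
    "cat": ("k", "ae", "t"),
    "at": ("ae", "t"),
    "tack": ("t", "ae", "k"),
}

EPS = "<eps>"


def build_lexicon_arcs(lexicon):
    """Build arcs (src, input_phone, output_word, dst).

    State 0 is both the start state and the only final state, so every
    word is a loop that leaves state 0 and returns to it.
    """
    arcs = []
    next_state = 1
    for word, phones in lexicon.items():
        src = 0
        for i, phone in enumerate(phones):
            is_last = i == len(phones) - 1
            if is_last:
                dst = 0
            else:
                dst = next_state
                next_state += 1
            # Emit the word on the first arc, epsilon on the rest.
            out = word if i == 0 else EPS
            arcs.append((src, phone, out, dst))
            src = dst
    return arcs


def transduce(arcs, phones):
    """Return every word sequence the transducer emits for `phones`."""
    results = []

    def walk(state, pos, words):
        if pos == len(phones):
            if state == 0:  # must end in the final state
                results.append(tuple(words))
            return
        for src, inp, out, dst in arcs:
            if src == state and inp == phones[pos]:
                emitted = [] if out == EPS else [out]
                walk(dst, pos + 1, words + emitted)

    walk(0, 0, [])
    return results


ARCS = build_lexicon_arcs(LEXICON)
print(transduce(ARCS, ["ae", "t", "k", "ae", "t"]))
```

The phone symbols sit on the input side of every arc and the word symbols on the output side, which is the same orientation the description of L.fst gives.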

L_disambig.fst: The Phonetic Dictionary with Disambiguation Symbols FST

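The same lexicon as L.fst, but with disambiguation symbols (#1, #2, ...) appended to the phone sequences of homophones and of pronunciations that are a prefix of another pronunciation, so that the composed graph can be determinized.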
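The sketch below shows one way disambiguation symbols can be assigned. It is loosely modelled on the idea behind Kaldi's lexicon disambiguation step (a symbol is added when a pronunciation is shared by several words or is a prefix of a longer one); the function name `add_disambig`, the data layout, and the example entries are assumptions, not Kaldi's actual implementation.

```python
from collections import Counter


def add_disambig(lexicon):
    """lexicon: list of (word, phones) pairs, phones given as a tuple.

    Returns the same list with '#k' appended to every pronunciation that
    is either shared by several words or a proper prefix of another one.
    """
    counts = Counter(phones for _, phones in lexicon)

    # Collect every proper prefix of every pronunciation.
    prefixes = set()
    for _, phones in lexicon:
        for n in range(1, len(phones)):
            prefixes.add(phones[:n])

    last_used = {}  # pronunciation -> highest symbol index used so far
    result = []
    for word, phones in lexicon:
        if counts[phones] == 1 and phones not in prefixes:
            result.append((word, phones))  # unambiguous, leave as is
        else:
            k = last_used.get(phones, 0) + 1
            last_used[phones] = k
            result.append((word, phones + (f"#{k}",)))
    return result


# Hypothetical entries: two homophones and one prefix pronunciation.
TOY = [
    ("red", ("r", "eh", "d")),
    ("read", ("r", "eh", "d")),
    ("a", ("ah",)),
    ("about", ("ah", "b", "aw", "t")),
]
DISAMBIG = add_disambig(TOY)
for word, phones in DISAMBIG:
    print(word, " ".join(phones))
```

With these entries, "red" and "read" receive different symbols because they share a pronunciation, and "a" receives one because its pronunciation is a prefix of "about".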

G.fst: The Language Model FST

G is a finite-state acceptor (FSA) that encodes the grammar or language model; it can be built from an n-gram language model.

C.fst: The Context FST

C maps triphone sequences to monophones.

Put the other way round, it expands the phones into context-dependent phones.
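The expansion into context-dependent phones can be sketched in a few lines of Python. The `left/center/right` label format, the `<eps>` boundary marker, and the function names are assumptions chosen for readability; the labels Kaldi actually writes into C.fst are integer ids.

```python
def to_triphones(phones, boundary="<eps>"):
    """Expand a monophone sequence into left/center/right context windows."""
    padded = [boundary] + list(phones) + [boundary]
    return [
        f"{padded[i - 1]}/{padded[i]}/{padded[i + 1]}"
        for i in range(1, len(padded) - 1)
    ]


def to_monophones(triphones):
    """The direction C itself works in: triphone labels in, center phones out."""
    return [label.split("/")[1] for label in triphones]


TRIPHONES = to_triphones(["k", "ae", "t"])
print(TRIPHONES)
print(to_monophones(TRIPHONES))
```

The second function is the mapping described for C (triphones on the input side, monophones on the output side); the first is its inverse, which is what "expanding" the phones amounts to.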

H.fst: The HMM FST

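H maps sequences of HMM states (represented by transition-ids in Kaldi-speak) to context-dependent triphones.

It expands out the HMMs. On the right (the output side) are the context-dependent phones and on the left (the input side) are the transition-ids, each of which encodes a pdf-id together with other information.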

HCLG.fst: The Final Graph

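To summarize:

Graph construction order: G -> L -> C -> H

 G: an acceptor (its input and output symbols are identical) that encodes the grammar or language model

 L: the lexicon. Its output symbols are words and its input symbols are phones

 C: context dependency. Its output symbols are phones and its input symbols represent context-dependent phones

 H: contains the HMM definitions. Its output symbols represent context-dependent phones and its input symbols are transition-ids (ids that encode a pdf-id together with other information)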
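The construction order works because each stage's output symbols are the next stage's input symbols, so the graphs can be chained by composition, starting from G and adding L, C, and H on the left. Below is a minimal sketch of epsilon-free composition in Python. The dictionary layout, the `compose` helper, and the toy machines (three single-phone "words" and a two-word grammar) are assumptions made purely for illustration; a real recipe uses OpenFst-style composition with epsilon handling, and also applies steps such as determinization and minimization that are not shown here.

```python
def compose(a, b):
    """Compose two epsilon-free transducers.

    Each machine is a dict with 'start', 'finals' (a set) and 'arcs',
    a list of (src, input, output, dst). An arc of `a` pairs with an arc
    of `b` when a's output label equals b's input label.
    """
    start = (a["start"], b["start"])
    arcs, finals = [], set()
    stack, seen = [start], {start}
    while stack:
        state = stack.pop()
        sa, sb = state
        if sa in a["finals"] and sb in b["finals"]:
            finals.add(state)
        for src_a, in_a, out_a, dst_a in a["arcs"]:
            if src_a != sa:
                continue
            for src_b, in_b, out_b, dst_b in b["arcs"]:
                if src_b == sb and in_b == out_a:
                    dst = (dst_a, dst_b)
                    arcs.append((state, in_a, out_b, dst))
                    if dst not in seen:
                        seen.add(dst)
                        stack.append(dst)
    return {"start": start, "finals": finals, "arcs": arcs}


# Toy L: phones in, words out; every word is a single phone (hypothetical).
L_TOY = {
    "start": 0,
    "finals": {0},
    "arcs": [(0, "ay", "I", 0), (0, "ow", "oh", 0), (0, "ah", "a", 0)],
}

# Toy G: an acceptor for the single word sequence "oh I".
G_TOY = {
    "start": 0,
    "finals": {2},
    "arcs": [(0, "oh", "oh", 1), (1, "I", "I", 2)],
}

LG = compose(L_TOY, G_TOY)
for arc in LG["arcs"]:
    print(arc)
```

The composed machine keeps only the phone sequences that spell word sequences the grammar accepts, with phones on the input side and words on the output side. Repeating the same step with C and then H moves the input side to context-dependent phones and finally to transition-ids.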
