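The HMM model in Kaldi is, in practice, a TransitionModel object. This object describes the HMM topology of the phones, stores the information related to pdf-ids and transition-ids, and converts between the various identifiers.
TransitionModel is defined and implemented in transition-model.h and transition-model.cc. Before studying this object, you should first read and understand the material on hmm-topology.
Before introducing TransitionModel, here are a few concepts.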
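phone: a phone, numbered from 1. It can be mapped to a concrete phone symbol through phones.txt.
HMM-state: a state of a phone's HMM, numbered from 0.
pdf-id: the index of a pdf used by the decision tree and the acoustic model, numbered from 0.
transition-state: a (virtual) state that moves to itself or to other states along arcs, numbered from 1. In some cases it corresponds one-to-one to a pdf-id.
transition-index: the index of a transition within an HMM state, i.e. the index into HmmTopology::HmmState::transitions, numbered from 0.
transition-id: an index over the arcs of all HMM states, numbered from 1.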
Usually, a phone, an HMM-state and the pdf-ids (both forward-pdf-id and self-loop-pdf-id) are grouped into a tuple, and each tuple maps to one transition-state. A transition-state together with a specific transition-index maps to a transition-id. The mappings are:
(phone, HMM-state, forward-pdf-id, self-loop-pdf-id) -> transition-state
(transition-state, transition-index) -> transition-id
The reverse mappings exist as well:
transition-id -> transition-state
transition-id -> transition-index
transition-state -> phone
transition-state -> HMM-state
transition-state -> forward-pdf-id
transition-state -> self-loop-pdf-id
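The mappings listed above can be illustrated with a small, self-contained sketch. It is not Kaldi code: ToyTuple and the three Toy* functions are hypothetical names introduced here, and returning 0 for an unknown tuple is an assumption made for the example. The sketch assumes the tuples are kept sorted, so that a binary search gives the tuple -> transition-state mapping, and that a table holding the first transition-id of each transition-state is available.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the (phone, HMM-state, forward-pdf-id,
// self-loop-pdf-id) tuple. Not Kaldi code.
struct ToyTuple {
  int32_t phone, hmm_state, forward_pdf, self_loop_pdf;
};

// Lexicographic ordering so that a sorted vector can be binary-searched.
inline bool operator<(const ToyTuple &a, const ToyTuple &b) {
  return std::tie(a.phone, a.hmm_state, a.forward_pdf, a.self_loop_pdf) <
         std::tie(b.phone, b.hmm_state, b.forward_pdf, b.self_loop_pdf);
}

// tuple -> transition-state (one-based). `tuples` must be sorted.
// Assumption for this sketch: returns 0 when the tuple is absent.
inline int32_t ToyTupleToTransitionState(const std::vector<ToyTuple> &tuples,
                                         const ToyTuple &t) {
  auto it = std::lower_bound(tuples.begin(), tuples.end(), t);
  if (it == tuples.end() || t < *it) return 0;
  return static_cast<int32_t>(it - tuples.begin()) + 1;
}

// (transition-state, transition-index) -> transition-id.
// state2id[s] holds the first transition-id of transition-state s, and
// state2id has one extra trailing entry so the last state's count is known.
inline int32_t ToyPairToTransitionId(const std::vector<int32_t> &state2id,
                                     int32_t trans_state,
                                     int32_t trans_index) {
  assert(trans_state >= 1 &&
         trans_state + 1 < static_cast<int32_t>(state2id.size()));
  assert(trans_index >= 0 &&
         trans_index < state2id[trans_state + 1] - state2id[trans_state]);
  return state2id[trans_state] + trans_index;
}

// transition-id -> transition-index: the offset of the id from the first
// transition-id of its transition-state.
inline int32_t ToyTransitionIdToTransitionIndex(
    const std::vector<int32_t> &state2id,
    const std::vector<int32_t> &id2state, int32_t trans_id) {
  assert(trans_id >= 1 && trans_id < static_cast<int32_t>(id2state.size()));
  return trans_id - state2id[id2state[trans_id]];
}
```

With three transition-states that each have two arcs, state2id would be {0, 1, 3, 5, 7} (slot 0 unused), so the pair (transition-state 2, transition-index 1) maps to transition-id 4, and transition-id 4 maps back to transition-index 1.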
The definition of TransitionModel in Kaldi is shown below; the code has been modified to make it easier to read and understand.
class TransitionModel {
public:
TransitionModel() { }
void Read(std::istream &is, bool binary);
void Write(std::ostream &os, bool binary) const;
/// return reference to HMM-topology object.
const HmmTopology &GetTopo() const { return topo_; }
/// \name Integer mapping functions
/// @{
int32 TupleToTransitionState(int32 phone, int32 hmm_state, int32 pdf, int32 self_loop_pdf) const;
int32 PairToTransitionId(int32 trans_state, int32 trans_index) const;
int32 TransitionIdToTransitionState(int32 trans_id) const; //return id2state_[trans_id];
int32 TransitionIdToTransitionIndex(int32 trans_id) const;
int32 TransitionStateToPhone(int32 trans_state) const; //return tuples_[trans_state-1].phone;
int32 TransitionStateToHmmState(int32 trans_state) const;
int32 TransitionStateToForwardPdfClass(int32 trans_state) const;
int32 TransitionStateToSelfLoopPdfClass(int32 trans_state) const;
int32 TransitionStateToForwardPdf(int32 trans_state) const;
int32 TransitionStateToSelfLoopPdf(int32 trans_state) const;
int32 SelfLoopOf(int32 trans_state) const; // returns the self-loop transition-id, or zero if
// this state doesn't have a self-loop.
inline int32 TransitionIdToPdf(int32 trans_id) const; //return id2pdf_id_[trans_id];
int32 TransitionIdToPhone(int32 trans_id) const; //return tuples_[id2state_[trans_id]-1].phone;
int32 TransitionIdToPdfClass(int32 trans_id) const;
int32 TransitionIdToHmmState(int32 trans_id) const;
/// Returns the total number of transition-ids (note, these are one-based).
inline int32 NumTransitionIds() const { return id2state_.size()-1; }
/// Returns the number of transition-indices for a particular transition-state.
/// Note: "Indices" is the plural of "index". Index is not the same as "id",
/// here. A transition-index is a zero-based offset into the transitions
/// out of a particular transition state.
int32 NumTransitionIndices(int32 trans_state) const {
return state2id_[trans_state+1]-state2id_[trans_state];
}
/// Returns the total number of transition-states (note, these are one-based).
int32 NumTransitionStates() const { return tuples_.size(); }
// NumPdfs() actually returns the highest-numbered pdf we ever saw, plus one.
// In normal cases this should equal the number of pdfs in the system, but if you
// initialized this object with fewer than all the phones, and it happens that
// an unseen phone has the highest-numbered pdf, this might be different.
int32 NumPdfs() const { return num_pdfs_; }
BaseFloat GetTransitionLogProb(int32 trans_id) const {
return log_probs_(trans_id);
}
private:
struct Tuple {
int32 phone;
int32 hmm_state;
int32 forward_pdf;
int32 self_loop_pdf;
Tuple() { }
Tuple(int32 phone, int32 hmm_state, int32 forward_pdf, int32 self_loop_pdf):
phone(phone), hmm_state(hmm_state), forward_pdf(forward_pdf), self_loop_pdf(self_loop_pdf) { }
};
HmmTopology topo_;
/// Tuples indexed by transition state minus one;
/// the tuples are in sorted order which allows us to do the reverse mapping from
/// tuple to transition state
std::vector<Tuple> tuples_;
/// Gives the first transition_id of each transition-state; indexed by
/// the transition-state. Array indexed 1..num-transition-states+1 (the last one
/// is needed so we can know the num-transitions of the last transition-state).
std::vector<int32> state2id_;
/// For each transition-id, the corresponding transition
/// state (indexed by transition-id).
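std::vector<int32> id2state_;
/// For each transition-id, the corresponding pdf-id (indexed by transition-id).
std::vector<int32> id2pdf_id_;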
/// For each transition-id, the corresponding log-prob. Indexed by transition-id.
Vector<BaseFloat> log_probs_;
/// For each transition-state, the log of (1 - self-loop-prob). Indexed by
/// transition-state.
Vector<BaseFloat> non_self_loop_log_probs_;
/// This is actually one plus the highest-numbered pdf we ever got back from the
/// tree (but the tree numbers pdfs contiguously from zero so this is the number
/// of pdfs).
int32 num_pdfs_;
};
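To make the indexing conventions in the comments above more concrete, here is a minimal sketch of how the two index tables (first transition-id per transition-state, and transition-state per transition-id) could be built from the number of arcs leaving each transition-state. It is not the Kaldi implementation: ToyDerivedTables and ToyComputeDerived are hypothetical names, and the per-state arc counts are passed in directly here, whereas in Kaldi they would come from the HMM topology.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical container for the two derived index tables. Not Kaldi code.
struct ToyDerivedTables {
  // Indexed by transition-state (one-based); slot 0 is unused and the final
  // slot is a sentinel equal to (last transition-id + 1).
  std::vector<int32_t> state2id;
  // Indexed by transition-id (one-based); slot 0 is unused.
  std::vector<int32_t> id2state;
};

// num_transitions[s - 1] is the number of arcs leaving transition-state s.
inline ToyDerivedTables ToyComputeDerived(
    const std::vector<int32_t> &num_transitions) {
  const int32_t num_states = static_cast<int32_t>(num_transitions.size());
  ToyDerivedTables d;
  d.state2id.assign(num_states + 2, 0);  // slots 0 .. num_states + 1
  d.id2state.push_back(0);               // transition-ids start at 1
  int32_t cur_id = 1;
  for (int32_t s = 1; s <= num_states; ++s) {
    assert(num_transitions[s - 1] > 0);
    d.state2id[s] = cur_id;              // first transition-id of state s
    for (int32_t k = 0; k < num_transitions[s - 1]; ++k) {
      d.id2state.push_back(s);           // ids cur_id, cur_id + 1, ... map to s
      ++cur_id;
    }
  }
  d.state2id[num_states + 1] = cur_id;   // sentinel for the last state
  return d;
}
```

With these tables, the number of transition-ids is id2state.size() - 1, and the number of transition-indices of a state s is state2id[s + 1] - state2id[s], which matches the arithmetic in NumTransitionIds() and NumTransitionIndices() in the listing.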
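The HMM model that is actually written to a model file (such as final.mdl) is a TransitionModel object. Not every member variable is written, however: only topo_, tuples_ and log_probs_ are stored. The remaining members are computed afterwards. The table below summarizes several of the member variables.
In the table, "tr_state" stands for transition-state.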