Deep Neural Networks in Kaldi

1. TDNN

Reference: https://blog.csdn.net/qq_14962179/article/details/87926351

2. nnet3

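Kaldi provides four DNN setups:

a) nnet1 - Karel's implementation. It is simple, supports training on a single GPU only, and is maintained by Karel.

b) nnet2 - Daniel Povey's implementation, built around p-norm networks. It is flexible, supports training on multiple GPUs or CPUs, and is maintained by Daniel.

c) nnet3 - an improved version of nnet2, maintained by Daniel.

d) nnet3 + chain - Daniel Povey's refinement of nnet3. It makes real-time decoding feasible, and decoding is 3 to 5 times faster than with plain nnet3.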

3. chain-model

The 'chain' models are a type of DNN-HMM model, implemented using nnet3, and differ from the conventional model in various ways; you can think of them as a different design point in the space of acoustic models.

  • We use a 3 times smaller frame rate at the output of the neural net. This significantly reduces the amount of computation required at test time, making real-time decoding much easier.
  • The models are trained right from the start with a sequence-level objective function, namely the log probability of the correct sequence. It is essentially MMI implemented without lattices on the GPU, by doing a full forward-backward on a decoding graph derived from a phone n-gram language model.
  • Because of the reduced frame rate, we need to use unconventional HMM topologies (allowing the traversal of the HMM in one state).
  • We use fixed transition probabilities in the HMM, and don't train them (we may decide to train them in future; but for the most part the neural-net output probabilities can do the same job as the transition probabilities, depending on the topology).
  • Currently, only nnet3 DNNs are supported (see The "nnet3" setup), and online decoding has not yet been implemented (we're aiming for April to June 2016).
  • Currently the results are a bit better than those of conventional DNN-HMMs (about 5% relative better), but the system is about 3 times faster to decode; training time is probably a bit faster too, but we haven't compared it exactly.

(Excerpted from the official Kaldi documentation: https://www.kaldi-asr.org/doc/chain.html)
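
To make the "3 times smaller frame rate" point concrete, here is a rough Python sketch of the arithmetic. It is not Kaldi code: the function names are made up for illustration, the 10 ms input frame shift is an assumed (typical) value, and rounding the output count up is also an assumption.

```python
def num_output_frames(num_input_frames: int, subsampling_factor: int = 3) -> int:
    """Count network outputs when one output is produced per
    `subsampling_factor` input frames (rounding up; an assumption)."""
    if num_input_frames < 0:
        raise ValueError("num_input_frames must be non-negative")
    if subsampling_factor < 1:
        raise ValueError("subsampling_factor must be at least 1")
    # Integer ceiling division: ceil(n / f) without floating point.
    return (num_input_frames + subsampling_factor - 1) // subsampling_factor


def output_frame_shift_ms(input_frame_shift_ms: float = 10.0,
                          subsampling_factor: int = 3) -> float:
    """Time step between consecutive network outputs, in milliseconds.
    The 10 ms default is an assumed input frame shift."""
    return input_frame_shift_ms * subsampling_factor


if __name__ == "__main__":
    frames = 1000  # about 10 seconds of audio at the assumed 10 ms shift
    outputs = num_output_frames(frames)
    print(f"{frames} input frames -> {outputs} network outputs")
    print(f"output frame shift: {output_frame_shift_ms()} ms")
```

Under these assumptions a 10-second utterance needs about 334 network outputs instead of 1000, which is where most of the decoding-time saving described in the excerpt comes from.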
