Speech Recognition: Downloading and Installing Kaldi

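Kaldi is a powerful toolkit for speech recognition developers. Its name comes from the legendary goatherd said to have discovered the coffee plant. It is written in C++ and currently supports training and decoding with several kinds of speech recognition models, including GMM-HMM, SGMM-HMM, and DNN-HMM. Downloading and installing it is also straightforward.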

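Download:

As with any open-source project on GitHub, you can download Kaldi with git clone. If git is not installed, install it first. Alternatively, download the source archive from GitHub, extract it, and carry out the installation steps from there.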

   git clone https://github.com/kaldi-asr/kaldi.git kaldi --origin upstream
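If you want to script the download step, here is a minimal sketch. The function name fetch_kaldi and its messages are hypothetical, made up for illustration; only the git clone command itself comes from this post. It skips the clone when the target directory already exists and reports a missing git instead of failing obscurely.

```shell
# Hypothetical helper (not part of Kaldi): clone only when needed.
fetch_kaldi() {
  target="${1:-kaldi}"
  # Do nothing if the target directory is already there.
  if [ -d "$target" ]; then
    echo "$target already exists; skipping clone"
    return 0
  fi
  # git is required for the clone itself.
  if ! command -v git >/dev/null 2>&1; then
    echo "git not found; install it first" >&2
    return 1
  fi
  git clone https://github.com/kaldi-asr/kaldi.git "$target" --origin upstream
}

# Usage: fetch_kaldi          (clones into ./kaldi)
#        fetch_kaldi mydir    (clones into ./mydir)
```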

Installation:

Enter the downloaded Kaldi directory:

cd kaldi

In the Kaldi root directory, open the INSTALL file. It contains the following:

This is the official Kaldi INSTALL. Look also at INSTALL.md for the git mirror installation.
[for native Windows install, see windows/INSTALL]

(1)
go to tools/  and follow INSTALL instructions there.

(2)
go to src/ and follow INSTALL instructions there.

Following step 1, first enter the tools folder:

cd tools

The tools folder has its own INSTALL file, which reads:

To check the prerequisites for Kaldi, first run

  extras/check_dependencies.sh

and see if there are any system-level installations you need to do. Check the
output carefully. There are some things that will make your life a lot easier
if you fix them at this stage. If your system default C++ compiler is not
supported, you can do the check with another compiler by setting the CXX
environment variable, e.g.

  CXX=g++-4.8 extras/check_dependencies.sh

Then run

  make

which by default will install ATLAS headers, OpenFst, SCTK and sph2pipe.
OpenFst requires a relatively recent C++ compiler with C++11 support, e.g.
g++ >= 4.7, Apple clang >= 5.0 or LLVM clang >= 3.3. If your system default
compiler does not have adequate support for C++11, you can specify a C++11
compliant compiler as a command argument, e.g.

  make CXX=g++-4.8

If you have multiple CPUs and want to speed things up, you can do a parallel
build by supplying the "-j" option to make, e.g. to use 4 CPUs

  make -j 4

In extras/, there are also various scripts to install extra bits and pieces that
are used by individual example scripts.  If an example script needs you to run
one of those scripts, it will tell you what to do.

As instructed above, first run the dependency check script to prepare for the build:

extras/check_dependencies.sh

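Depending on what is already installed on your machine, the script may report that some packages are missing. Install them as prompted. For example, the output may look like this: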

extras/check_dependencies.sh: we recommend that you run (our best guess):

sudo apt-get install automake autoconf libtool subversion

You should probably do:

sudo apt-get install libatlas3-base

/bin/sh is linked to dash, and currently some of the scripts will not run properly. We recommend to run:

sudo ln -s -f bash /bin/sh

For the output above, the following commands are enough. Use whatever commands your own run of the script suggests:

sudo apt-get install automake autoconf libtool subversion
sudo apt-get install libatlas3-base
sudo ln -s -f bash /bin/sh
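The last command replaces the /bin/sh symlink for the whole system, so it can be worth checking where a link points before and after changing it. A small sketch, assuming /bin/sh is a symlink on your system (as the dash message above implies); the helper name link_target is hypothetical:

```shell
# Hypothetical helper: print the target of a symlink (default: /bin/sh).
# Prints nothing and returns non-zero if the path is not a symlink.
link_target() {
  readlink "${1:-/bin/sh}"
}

# Usage: link_target            (e.g. prints "dash" or "bash")
#        link_target /some/link
```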

Once the dependencies are installed, running the check script again prints:

extras/check_dependencies.sh: all OK. 
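If you are automating the setup, you can gate the build on this message. The sketch below assumes the success line always contains the text "all OK", as in the output shown here; I have not checked whether the script also signals success through its exit status, and the function name deps_ok is hypothetical.

```shell
# Hypothetical helper: succeed only if the text on stdin contains "all OK".
deps_ok() {
  grep -q "all OK"
}

# Usage (run inside tools/); 2>&1 captures the message whichever stream it uses:
#   extras/check_dependencies.sh 2>&1 | deps_ok && echo "ready to build"
```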

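Next, build the tools. The option -j 9 tells make to run up to 9 jobs in parallel. Most home laptops do not have that many CPU cores, so you can lower the number or omit the option: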

make -j 9

The output looks like this:

Warning: IRSTLM is not installed by default anymore. If you need IRSTLM
Warning: use the script extras/install_irstlm.sh
All done OK.

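The output contains a warning that IRSTLM is no longer installed by default. If your work needs it, run extras/install_irstlm.sh as the warning suggests. With that, everything under tools is installed.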
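Rather than guessing a value for -j in the build above, you can derive it from the number of cores on the machine. A minimal sketch: nproc (GNU coreutils) and getconf _NPROCESSORS_ONLN are assumed to be available on typical Linux and macOS systems, with a fallback of 1 if neither works. The function name pick_jobs is hypothetical.

```shell
# Hypothetical helper: print a reasonable job count for "make -j".
pick_jobs() {
  n=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
  # Fall back to 1 if the result is empty or not a plain number.
  case "$n" in
    ''|*[!0-9]*) n=1 ;;
  esac
  echo "$n"
}

# Usage (run inside tools/ or src/):
#   make -j "$(pick_jobs)"
```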

Following step 2, enter the src directory:

cd ../src

The src directory also has an INSTALL file. It reads:

These instructions are valid for UNIX-like systems (these steps have
been run on various Linux distributions; Darwin; Cygwin).  For native Windows
compilation, see ../windows/INSTALL.

You must first have completed the installation steps in ../tools/INSTALL
(compiling OpenFst; getting ATLAS and CLAPACK headers).

The installation instructions are

  ./configure --shared
  make depend -j 8
  make -j 8

Note that we added the "-j 8" to run in parallel because "make" takes a long
time.  8 jobs might be too many for a laptop or small desktop machine with not
many cores.

For more information, see documentation at http://kaldi-asr.org/doc/
and click on "The build process (how Kaldi is compiled)".

As instructed, first run:

./configure --shared

When configuration succeeds, the output includes SUCCESS. Read the messages above SUCCESS as well, and act on them if they apply to you:

CUDA will not be used! If you have already installed cuda drivers
and cuda toolkit, try using --cudatk-dir=... option.  Note: this is
only relevant for neural net experiments
Info: configuring Kaldi not to link with Speex (don't worry, it's only needed if you
intend to use 'compress-uncompress-speex', which is very unlikely)
SUCCESS
To compile: make clean -j; make depend -j; make -j
 ... or e.g. -j 10, instead of -j, to use a specified number of CPUs

Compile the code under src, adding the -j option as needed:

make depend
make 

This step takes a long time on a personal computer, so go out for a meal (or more) and check back afterwards. When it finishes, you should see:

Done
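The three src steps above can be collected into one function. This is only a sketch: the name build_src and its dry-run mechanism are my own, not part of Kaldi. By default it just prints the commands so you can review them; passing eval as the second argument actually runs them, stopping at the first failure.

```shell
# Hypothetical helper: print (default) or run the src/ build steps.
#   $1 = number of parallel jobs (default 1)
#   $2 = "echo" to print the commands (default), "eval" to execute them
build_src() {
  jobs="${1:-1}"
  run="${2:-echo}"
  $run "./configure --shared" &&
  $run "make depend -j $jobs" &&
  $run "make -j $jobs"
}

# Usage (run inside src/):
#   build_src 4         (dry run, prints the three commands)
#   build_src 4 eval    (runs them)
```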

That completes the download and installation of Kaldi, and it is now ready to use.
