Installing Kaldi on Linux/macOS



Contents

    • I. About Kaldi
    • II. Installation
      • 1. Download the source code
      • 2. Read the INSTALL files
        • root -- INSTALL
        • tools -- INSTALL
        • src -- INSTALL
      • 3. Set up tools
        • Install MKL
        • Install irstlm, kaldi_lm, and openblas
      • 4. Set up src
    • III. Testing
      • Error 1: Bad FST header
      • Error 2: gmm-init-mono: command not found


I. About Kaldi

Kaldi is a toolkit for speech recognition, intended for use by speech recognition researchers and professionals.

  • Official site: https://www.kaldi-asr.org
  • GitHub: https://github.com/kaldi-asr/kaldi
  • Pretrained models: https://www.kaldi-asr.org/models.html
  • Official documentation: https://www.kaldi-asr.org/doc/

References

  • Kaldi installation tutorial for Ubuntu 18.04 (a summary of pitfalls hit during installation)
    https://zhuanlan.zhihu.com/p/148524930
  • AssemblyAI / kaldi-install-tutorial
    https://github.com/AssemblyAI/kaldi-install-tutorial/blob/main/setup.sh

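II. Installation

1. Download the source code

You can download the source directly from https://github.com/kaldi-asr/kaldi.


Some users report that cloning it this way works better: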

git clone https://github.com/kaldi-asr/kaldi.git kaldi-trunk --origin golden

If your network connection is poor, you can download it here instead: https://download.csdn.net/download/lovechris00/87301550


2. Read the INSTALL files

root – INSTALL

The INSTALL file in the root directory reads:

This is the official Kaldi INSTALL. Look also at INSTALL.md for the git mirror installation.
[Option 1 in the following does not apply to native Windows install, see windows/INSTALL or following Option 2]

Option 1 (bash + makefile):
  Steps:
    (1) go to tools/  and follow INSTALL instructions there.
    (2) go to src/ and follow INSTALL instructions there.

Option 2 (cmake):
    Go to cmake/ and follow INSTALL.md instructions there.
    Note, it may not be well tested and some features are missing currently.

tools – INSTALL

The INSTALL file under tools reads:

To check the prerequisites for Kaldi, first run

  extras/check_dependencies.sh

and see if there are any system-level installations you need to do. Check the output carefully. There are some things that will make your life a lot easier if you fix them at this stage. If your system default C++ compiler is not supported, you can do the check with another compiler by setting the CXX environment variable, e.g.

  CXX=g++-4.8 extras/check_dependencies.sh

Then run

  make

which by default will install ATLAS headers, OpenFst, SCTK and sph2pipe.
OpenFst requires a relatively recent C++ compiler with C++11 support, e.g. g++ >= 4.7, Apple clang >= 5.0 or LLVM clang >= 3.3.
If your system default compiler does not have adequate support for C++11, you can specify a C++11
compliant compiler as a command argument, e.g.

  make CXX=g++-4.8

If you have multiple CPUs and want to speed things up, you can do a parallel build by supplying the -j option to make, e.g. to use 4 CPUs

  make -j 4

In extras/, there are also various scripts to install extra bits and pieces that are used by individual example scripts. If an example script needs you to run one of those scripts, it will tell you what to do.


src – INSTALL

The INSTALL file under src reads:

These instructions are valid for UNIX-like systems (these steps have been run on various Linux distributions; Darwin; Cygwin).
For native Windows compilation, see ../windows/INSTALL .
You must first have completed the installation steps in ../tools/INSTALL (compiling OpenFst; getting ATLAS and CLAPACK headers).
The installation instructions are

  ./configure --shared
  make depend -j 8
  make -j 8

Note that we added the -j 8 to run in parallel because “make” takes a long time. 8 jobs might be too many for a laptop or small desktop machine with not many cores.
For more information, see documentation at http://kaldi-asr.org/doc/ and click on “The build process (how Kaldi is compiled)”.


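3. Set up tools

From the root directory, enter the tools folder:

cd tools

# Check dependencies
./extras/check_dependencies.sh

If any packages are missing, the script tells you what to install.
On macOS, install them with brew install xxx.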
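
As a quick pre-check before running the dependency script, the sketch below lists which commands from a given list are not on PATH. The function name missing_tools and the tool names in the usage comment are assumptions made for illustration; extras/check_dependencies.sh remains the authoritative check.

```shell
# Print the name of every command in the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Hypothetical tool list, for illustration only:
# missing_tools git make wget
```

An empty result means every listed command was found; anything printed is a candidate for your package manager (brew on macOS).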


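Build

make -j 4

Running this command downloads the third-party packages and extracts them automatically.
If a later step fails (a package was not installed, or a downloaded archive has the wrong size), run make again.
Extract by hand any archive that was not extracted automatically.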
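
When extracting by hand, it may help to confirm that the archive is intact first, since a truncated download is one of the failure modes mentioned above. This is a minimal sketch that assumes a gzip-compressed tarball (.tar.gz); the function name extract_if_valid is hypothetical, and other archive formats would need a different check.

```shell
# Extract a .tar.gz archive into a directory, but only if its contents list cleanly.
# Returns non-zero (and extracts nothing) when the archive is truncated or corrupt.
extract_if_valid() {
  archive=$1
  dest=$2
  if ! tar -tzf "$archive" >/dev/null 2>&1; then
    echo "corrupt or incomplete archive: $archive" >&2
    return 1
  fi
  mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}
```

If the function reports a corrupt archive, deleting the file and running make again should trigger a fresh download.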


Build the third-party packages

make openfst
make cub
make sclite
make sph2pipe

If you later hit the error "you may not have installed OpenFst", the usual cause is that OpenFst was not built successfully at this step.
Reference article: https://blog.csdn.net/weixin_42103947/article/details/119842650
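
One way to catch this early is to check for an OpenFst binary before moving on to src. The sketch below assumes the build leaves an executable at openfst/bin/fstcompile inside the tools directory; that path and the function name openfst_ready are assumptions, so adjust them if your tree uses a versioned folder such as openfst-1.7.2.

```shell
# Succeed only if an OpenFst build appears to exist under the given tools directory.
# Assumption: a finished build leaves an executable at openfst/bin/fstcompile.
openfst_ready() {
  [ -x "$1/openfst/bin/fstcompile" ]
}

# Usage from the tools directory:
# openfst_ready . || echo "OpenFst looks unbuilt; try: make openfst"
```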


Install MKL

On Linux, install it with the following command:

./extras/install_mkl.sh

On a Mac, the command fails with:

./extras/install_mkl.sh: This script can be used on Linux only, and your system is Darwin.
Installer packages for Mac and Windows are available for download from Intel:


You need to download it from the following site instead:
https://software.intel.com/mkl/choose-download

I downloaded the offline installer; just open the app to install it.
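
The Linux/macOS split above can be sketched as a small branch on the kernel name that `uname -s` reports (the error message shows Darwin for macOS). The function name mkl_install_hint and its messages are made up for this illustration.

```shell
# Print the MKL installation route for a given kernel name (as reported by `uname -s`).
mkl_install_hint() {
  case "$1" in
    Linux)  echo "run ./extras/install_mkl.sh" ;;
    Darwin) echo "download the installer from Intel" ;;
    *)      echo "unsupported by this sketch: $1" ;;
  esac
}

# Show the hint for the machine this runs on.
mkl_install_hint "$(uname -s)"
```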



Install irstlm, kaldi_lm, and openblas

sudo ./extras/install_irstlm.sh

sudo ./extras/install_kaldi_lm.sh

sudo ./extras/install_openblas.sh
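
Since these installers take a while, it can be convenient to run them in sequence and stop at the first one that fails. The sketch below is one way to do that; the function name run_in_order is hypothetical, and sudo is left out, so run the wrapper with whatever privileges your setup needs.

```shell
# Run each given script in order and stop at the first failure,
# printing which one failed so it can be re-run on its own.
run_in_order() {
  for script in "$@"; do
    echo "running: $script"
    if ! "$script"; then
      echo "failed: $script" >&2
      return 1
    fi
  done
}

# Usage from the tools directory:
# run_in_order ./extras/install_irstlm.sh ./extras/install_kaldi_lm.sh ./extras/install_openblas.sh
```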

4. Set up src

The src folder sits alongside tools.
Switch from tools to src:

cd ../src

./configure --shared
make depend -j 8
make -j 8
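
The INSTALL notes warn that 8 jobs may be too many for a small machine. A rough sketch for matching the job count to the available cores follows; it assumes nproc is available on Linux and sysctl -n hw.ncpu on macOS, and the function name cpu_count is made up for this example.

```shell
# Print the number of CPU cores, falling back to 1 if it cannot be determined.
cpu_count() {
  case "$(uname -s)" in
    Darwin) n=$(sysctl -n hw.ncpu 2>/dev/null) || n='' ;;
    *)      n=$(nproc 2>/dev/null) || n='' ;;
  esac
  # Fall back to a single job if detection failed or returned a non-number.
  case "$n" in
    ''|*[!0-9]*) n=1 ;;
  esac
  echo "$n"
}

# Usage in src/:
# make depend -j "$(cpu_count)" && make -j "$(cpu_count)"
```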

III. Testing

From the Kaldi root directory:

cd egs/yesno/s5
./run.sh

Output similar to the following means the run basically succeeded (Kaldi is installed):

steps/diagnostic/analyze_lats.sh: see stats in exp/mono0a/decode_test_yesno/log/analyze_lattice_depth_stats.log
local/score.sh --cmd utils/run.pl data/test_yesno exp/mono0a/graph_tgpr exp/mono0a/decode_test_yesno
local/score.sh: scoring with word insertion penalty=0.0,0.5,1.0
%WER 0.00 [ 0 / 232, 0 ins, 0 del, 0 sub ] exp/mono0a/decode_test_yesno/wer_10_0.0
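
To check the result from a script rather than by eye, the WER figure can be pulled out of the %WER line. This is a minimal sketch that assumes the line format shown above; the function names wer_value and wer_at_most are hypothetical.

```shell
# Read scoring output on stdin and print the WER figure from the first %WER line.
wer_value() {
  awk '$1 == "%WER" { print $2; exit }'
}

# Read scoring output on stdin; succeed only when the WER is at or below the threshold.
wer_at_most() {
  threshold=$1
  wer=$(wer_value)
  [ -n "$wer" ] && awk -v w="$wer" -v t="$threshold" 'BEGIN { exit !(w + 0 <= t + 0) }'
}

# Usage from egs/yesno/s5 (the threshold is an arbitrary example):
# ./run.sh | wer_at_most 1.0 && echo "yesno recipe looks fine"
```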

Error 1: Bad FST header

If you see the following error:

ERROR: FstHeader::Read: Bad FST header: standard input

Add the OpenFst bin directory to your PATH environment variable.
You can also put these lines in egs/yesno/s5/path.sh:

export FST_PATH='/Users/xx/kaldi-trunk/tools/openfst-1.7.2/bin'
export PATH="$FST_PATH:$PATH"

Then run:

source path.sh 
./run.sh 
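
Because path.sh and shell profiles tend to be sourced repeatedly, a guard against adding the same directory twice can keep PATH tidy. The helper below is a sketch; the name path_prepend is made up for this example.

```shell
# Prepend a directory to PATH unless it is already listed.
path_prepend() {
  case ":$PATH:" in
    *":$1:"*) ;;              # already present, leave PATH unchanged
    *) PATH="$1:$PATH" ;;
  esac
  export PATH
}

# Usage with the directory from this post:
# path_prepend /Users/xx/kaldi-trunk/tools/openfst-1.7.2/bin
```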

Error 2: gmm-init-mono: command not found

run.pl: job failed, log is in exp/mono0a/log/init.log

# gmm-init-mono --shared-phones=data/lang/phones/sets.int "--train-feats=ark,s,cs:apply-cmvn  --utt2spk=ark:data/train_yesno/split1/1/utt2spk scp:data/train_yesno/split1/1/cmvn.scp scp:data/train_yesno/split1/1/feats.scp ark:- | add-deltas  ark:- ark:- | subset-feats --n=10 ark:- ark:-|" data/lang/topo 39 exp/mono0a/0.mdl exp/mono0a/tree 
# Started at Fri Dec 16 20:27:09 CST 2022
#
bash: line 1: gmm-init-mono: command not found
# Accounting: time=0 threads=1
# Ended (code 127) at Fri Dec 16 20:27:09 CST 2022, elapsed time 0 seconds

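Presumably gmm-init-mono is a command-line tool that the shell cannot locate.
Searching the kaldi folder shows that it lives at src/gmmbin/gmm-init-mono, so add the src/gmmbin directory to your PATH.
The file to edit is ~/.bash_profile on macOS and ~/.bashrc on Linux.

export GMMBIN_PATH='/Users/XX/XX/XX/kaldi-trunk/src/gmmbin'
export PATH="$GMMBIN_PATH:$PATH"

Then reload the profile (use ~/.bashrc on Linux) and run the script again: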

source ~/.bash_profile
./run.sh 
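
The search step described above can be sketched with find: given a root folder and a tool name, print the directory that holds the tool. The function name find_tool_dir is hypothetical.

```shell
# Print the directory containing the first regular file named $2 found under $1.
# Returns non-zero when nothing matches.
find_tool_dir() {
  match=$(find "$1" -type f -name "$2" 2>/dev/null | head -n 1)
  [ -n "$match" ] && dirname "$match"
}

# Usage from the Kaldi root directory:
# find_tool_dir . gmm-init-mono
```

The printed directory is the one to add to PATH for any other "command not found" error of the same kind.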

伊织, 2022-12-16 (Fri)
