ESPnet



Contents

    • About ESPnet
    • Installation and setup
    • Running yesno

About ESPnet

  • GitHub: https://github.com/espnet/espnet

ESPnet is an end-to-end speech processing toolkit covering end-to-end speech recognition, text-to-speech, speech translation, speech enhancement, speaker diarization, spoken language understanding, and so on.
ESPnet uses PyTorch as its deep learning engine and follows Kaldi-style data processing, feature extraction/format, and recipes to provide a complete setup for various speech processing experiments.



Installation and setup

1. Download

git clone https://github.com/espnet/espnet.git


2. Create a symbolic link to Kaldi

cd espnet/tools
ln -s <path to kaldi> .

3. Install dependencies

pip install chainer==6.0.0 cupy-cuda92==6.0.0

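Note that cupy-cuda92 is the CuPy build for CUDA 9.2; choose the wheel that matches the CUDA version on your machine.

Then run check_install.py from espnet/tools: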

python3 check_install.py

4. Build with make

make KALDI=~/xxcode/kaldi PYTHON=~/miniconda3/bin/python CUDA_VERSION=11.3
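If the build cannot locate Kaldi, one thing worth checking is the link created in step 2. The following is a minimal sketch, not part of ESPnet: check_kaldi_link is a hypothetical helper, and it assumes the link is named kaldi (which is what ln -s produces when the target path ends in kaldi).

```shell
# Hypothetical helper (not an ESPnet script): report whether DIR/kaldi
# is a symbolic link that resolves to an existing directory.
# Assumption: the link created in step 2 is named "kaldi".
check_kaldi_link() {
    link="$1/kaldi"
    # -L: the path itself is a symlink; -d: its target is a directory.
    if [ -L "$link" ] && [ -d "$link" ]; then
        echo "ok: $link -> $(readlink "$link")"
    else
        echo "missing or broken: $link"
        return 1
    fi
}
```

From the directory where the repository was cloned, it could be called as check_kaldi_link espnet/tools.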

Running yesno

Go to the espnet/egs/yesno directory, which contains the tts1 and asr1 subdirectories. Enter one of them and run:

./run.sh

When the TTS recipe succeeds, it prints:

Succeeded creating wav for test_yesno
Succeeded creating wav for train_dev
Finished.

When the ASR recipe succeeds, it prints:

2023-01-28 19:57:40,756 (json2trn:46) INFO: reading exp/train_nodev_pytorch_train/decode_test_yesno_decode/data.json
2023-01-28 19:57:40,756 (json2trn:50) INFO: reading data/lang_1char/train_nodev_units.txt
write a CER (or TER) result in exp/train_nodev_pytorch_train/decode_test_yesno_decode/result.txt
       | SPKR   | # Snt # Wrd | Corr    Sub    Del    Ins    Err  S.Err |
       | Sum/Avg|   30    835 | 47.9   51.1    1.0   47.2   99.3  100.0 |
Finished
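
To pull the overall error rate out of result.txt without reading the table by eye, something like the sketch below may help. sum_avg_err is a hypothetical helper, and it assumes the Sum/Avg row has exactly the layout shown in the log above, with Err as the second-to-last number before the closing bar.

```shell
# Hypothetical helper: print the Err column of the first Sum/Avg row.
# Assumption: the row looks like the one in the log above, i.e.
#   | Sum/Avg| <Snt> <Wrd> | Corr Sub Del Ins Err S.Err |
sum_avg_err() {
    # Split on whitespace: the last field is "|", then S.Err, then Err.
    awk '/Sum\/Avg/ { print $(NF-2); exit }' "$1"
}
```

For the run shown above, sum_avg_err exp/train_nodev_pytorch_train/decode_test_yesno_decode/result.txt would print the value in the Err column.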

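If the run fails because a file or command cannot be found, search for it manually. If it exists, add its directory to the PATH environment variable. If it does not exist, check whether one of the build steps failed.
If all of that looks fine, check whether Kaldi itself was installed and configured correctly.
For Kaldi installation and setup, see: https://blog.csdn.net/lovechris00/article/details/128347128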
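
The manual search described above can be sketched as a small shell function. add_cmd_to_path is a hypothetical helper, not something ESPnet or Kaldi ships: it takes a command name and a directory to search, and only changes PATH for the current shell session.

```shell
# Hypothetical helper: add_cmd_to_path CMD ROOT
# If CMD is not on PATH, look for an executable file named CMD under ROOT
# and prepend its directory to PATH for the current shell session.
add_cmd_to_path() {
    cmd="$1"
    root="$2"
    if command -v "$cmd" >/dev/null 2>&1; then
        echo "$cmd is already on PATH"
        return 0
    fi
    # Take the first regular file with that name that its owner can execute.
    found=$(find "$root" -type f -name "$cmd" -perm -u+x 2>/dev/null | head -n 1)
    if [ -z "$found" ]; then
        echo "$cmd not found under $root; a build step may have failed"
        return 1
    fi
    PATH="$(dirname "$found"):$PATH"
    export PATH
    echo "added $(dirname "$found") to PATH"
}
```

Call it directly in the shell that will run the recipe (not in a subshell), with the missing command name and the directory where Kaldi or ESPnet was built as arguments. To make the change permanent, the matching export line would still have to go into a shell startup file.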


2023-01-28 (Saturday)
Seventh day of the Lunar New Year and the first day back at work. 伊织 wishes everyone success in their studies.
