The wav2letter (w2l) GitHub repository has a demo:
https://github.com/facebookresearch/wav2letter/tree/master/tutorials/1-librispeech_clean
I trained a model by following the demo.
One thing has changed since the tutorial was written. The command
wav2letter/tutorials/librispeech_clean/prepare_data.py --src $W2LDIR/LibriSpeech/ --dst $W2LDIR
now has to be
wav2letter/tutorials/1-librispeech_clean/prepare_data.py --src $W2LDIR/LibriSpeech/ --dst $W2LDIR
But running it still produced an error:
wav2letter/tutorials/1-librispeech_clean/prepare_data.py: line 22:
Copyright (c) Facebook, Inc. and its affiliates.
All rights reserved.
This source code is licensed under the BSD-style license found in the
LICENSE file in the root directory of this source tree.
---------
Script to package original Mini Librispeech datasets into a form readable in
wav2letter++ pipelines
[If you haven't downloaded the datasets] Please download all the original datasets
in a folder on your own
> wget -qO- http://www.openslr.org/resources/12/train-clean-100.tar.gz | tar xvz
> wget -qO- http://www.openslr.org/resources/12/dev-clean.tar.gz | tar xvz
> wget -qO- http://www.openslr.org/resources/12/test-clean.tar.gz | tar xvz
Command : prepare_data.py --src [...]/LibriSpeech/ --dst [...]
Replace [...] with appropriate paths
: File name too long
from: can't read /var/mail/__future__
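The "File name too long" message made me think my directory path was too long, since out of habit I had put everything in a fairly deep subdirectory. I set the directories up again as the tutorial describes, but it still failed. Both messages actually come from the shell, not from Python: the script was being executed as a shell script, so its docstring was treated as a command name (hence "File name too long") and the "from" on the import line ran the system's from mail utility. My system's default python was also Python 2. Prefixing the command with python3.7 made it run normally: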
python3.7 wav2letter/tutorials/1-librispeech_clean/prepare_data.py --src $W2LDIR/LibriSpeech/ --dst $W2LDIR
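A quick way to tell whether a script is likely to be handed to the shell when executed directly is to look at its first line. The sketch below is only an illustration: the helper names are made up for this post, and whether prepare_data.py has a shebang in your checkout is something you would need to check yourself.

```python
def first_line(path):
    """Return the first line of a file, decoded leniently."""
    with open(path, "rb") as fh:
        return fh.readline().decode("utf-8", errors="replace").rstrip("\r\n")


def has_python_shebang(path):
    """True if the file starts with a '#!' line that mentions python.

    Without such a line, running the file directly from bash makes bash
    interpret the Python source itself, which produces errors like the
    ones above.
    """
    line = first_line(path)
    return line.startswith("#!") and "python" in line
```

Calling has_python_shebang on the script path before running it directly shows whether an explicit interpreter prefix such as python3.7 is needed.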
Afterwards I set this directly in bashrc:
alias python='/usr/local/bin/python3.7'
alias pip='/usr/local/bin/pip3.7'
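Note that shell aliases only apply to interactive shells; build scripts and other programs still look the command up on PATH. A small sketch (the function name is my own) to see what a non-interactive process would actually get:

```python
import shutil
import subprocess


def resolve_command(name):
    """Return (path, version text) for a command found on PATH, or None.

    This ignores shell aliases, which is exactly what scripts such as
    configure or cmake do as well.
    """
    path = shutil.which(name)
    if path is None:
        return None
    # Python 2 prints its version to stderr, so merge both streams.
    proc = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return path, proc.stdout.strip()


if __name__ == "__main__":
    for name in ("python", "python3"):
        print(name, "->", resolve_command(name))
```

If this still reports a Python 2 interpreter for python after the aliases are set, any tool that calls python on its own will keep using Python 2.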
Running the training then failed:
wav2letter/build/Train train --flagsfile wav2letter/tutorials/1-librispeech_clean/train.cfg
terminate called after throwing an instance of 'std::runtime_error'
what(): loadSound: unknown format or could not open stream
*** Aborted at 1569812537 (unix time) try "date -d @1569812537" if you are using GNU date ***
PC: @ 0x7f37686eb428 gsignal
*** SIGABRT (@0x3e8000034a6) received by PID 13478 (TID 0x7f377400c800) from PID 13478; stack trace: ***
@ 0x7f3769565390 (unknown)
@ 0x7f37686eb428 gsignal
@ 0x7f37686ed02a abort
@ 0x7f376903084d __gnu_cxx::__verbose_terminate_handler()
@ 0x7f376902e6b6 (unknown)
@ 0x7f376902e701 std::terminate()
@ 0x7f376902e919 __cxa_throw
@ 0x5d1ca1 w2l::loadSound<>()
@ 0x5d1e60 w2l::loadSound<>()
@ 0x5e230a w2l::W2lListFilesDataset::getLoaderData()
@ 0x5d77f4 w2l::W2lDataset::getFeatureData()
@ 0x5d8da9 w2l::W2lDataset::getFeatureDataAndPrefetch()
@ 0x5d90de w2l::W2lDataset::get()
@ 0x4821c4 _ZZ4mainENKUlSt10shared_ptrIN2fl6ModuleEES_IN3w2l17SequenceCriterionEES_INS3_10W2lDatasetEES_INS0_19FirstOrderOptimizerEES9_ddbiE3_clES2_S5_S7_S9_S9_ddbi.constprop.11419
@ 0x41b318 main
@ 0x7f37686d6830 __libc_start_main
@ 0x47d7b9 _start
Aborted
I reinstalled libsndfile following its default instructions:
git clone https://github.com/erikd/libsndfile.git
cd libsndfile
./autogen.sh
./configure --enable-werror
make
make check
sudo make install
Then I ran:
sndfile-play /w2l/LibriSpeech/train-clean-100/103/1240/103-1240-0009.flac
The audio played back fine. Training still failed with the same error, though.
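To check from a script whether the libsndfile found by the dynamic loader can open a given file, one option is to call its C API through ctypes. This is a rough sketch, not part of the wav2letter tooling: the function name is my own, the SF_INFO layout and SFM_READ value are taken from sndfile.h, and the library it loads is not necessarily the copy that Train was linked against.

```python
import ctypes
import ctypes.util
import os

SFM_READ = 0x10  # read-mode constant from sndfile.h


class SF_INFO(ctypes.Structure):
    # Field layout of SF_INFO as declared in sndfile.h
    _fields_ = [
        ("frames", ctypes.c_int64),
        ("samplerate", ctypes.c_int),
        ("channels", ctypes.c_int),
        ("format", ctypes.c_int),
        ("sections", ctypes.c_int),
        ("seekable", ctypes.c_int),
    ]


def try_open_with_libsndfile(path):
    """Return None if libsndfile cannot be loaded, else (ok, message)."""
    name = ctypes.util.find_library("sndfile")
    if name is None:
        return None
    try:
        lib = ctypes.CDLL(name)
    except OSError:
        return None
    lib.sf_open.argtypes = [ctypes.c_char_p, ctypes.c_int,
                            ctypes.POINTER(SF_INFO)]
    lib.sf_open.restype = ctypes.c_void_p
    lib.sf_strerror.argtypes = [ctypes.c_void_p]
    lib.sf_strerror.restype = ctypes.c_char_p
    lib.sf_close.argtypes = [ctypes.c_void_p]
    lib.sf_close.restype = ctypes.c_int

    info = SF_INFO()  # zero-initialised, as required when opening for read
    handle = lib.sf_open(os.fsencode(path), SFM_READ, ctypes.byref(info))
    if not handle:
        # With a NULL handle, sf_strerror reports the last open error.
        return False, lib.sf_strerror(None).decode("utf-8", errors="replace")
    lib.sf_close(handle)
    return True, "%d Hz, %d channel(s)" % (info.samplerate, info.channels)
```

Calling it on one of the LibriSpeech .flac files returns (False, message) if that libsndfile build cannot decode FLAC. To see which libsndfile the training binary really loads, ldd wav2letter/build/Train is the more direct check.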
https://github.com/facebookresearch/wav2letter/issues/241
https://github.com/facebookresearch/wav2letter/issues/360
Both issues mention this problem, but neither explains how to install a build with Ogg/Opus support. So I searched the libsndfile source tree:
grep -rn Ogg
and found this in README.md:
ENABLE_EXTERNAL_LIBS
- enable Ogg, Vorbis and FLAC support. This option is ON
if all dependency libraries were found.
So I forced it to ON:
cmake … -DENABLE_EXTERNAL_LIBS=ON
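After building, it still did not seem to work. Thinking it over calmly: the README says the option is switched ON automatically when all the dependency libraries are found, so the key was still to get hold of Opus. I found the official site, downloaded the package, and unpacked it.
Build and install: ./configure && make && make install
Then, configuring libsndfile again: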
-- Could NOT find Sndio (missing: SNDIO_LIBRARY SNDIO_INCLUDE_DIR)
-- Found Opus: /usr/local/lib/libopus.so (found version "1.3.1")
-- Could NOT find Speex (missing: SPEEX_LIBRARY SPEEX_INCLUDE_DIR)
-- Checking processor clipping capabilities...
-- Checking processor clipping capabilities... none
-- The following features have been disabled:
* BUILD_SHARED_LIBS , build shared libraries
* ENABLE_EXPERIMENTAL , enable experimental code
* ENABLE_CPU_CLIP , Enable tricky cpu specific clipper
* ENABLE_BOW_DOCS , enable black-on-white html docs
-- Configuring done
-- Generating done
-- Build files have been written to: /home/changshengwu/devpath/machinelearning/wav2letter/libsndfile/CMakeBuild
-- The following features have been enabled:
* BUILD_SHARED_LIBS , build shared libraries
* ENABLE_EXTERNAL_LIBS , enable FLAC, Vorbis, and Opus codecs
* BUILD_REGTEST , build regtest
* ENABLE_CPACK , enable CPack support
* ENABLE_PACKAGE_CONFIG , generate and install package config file
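However, at this point the build itself failed, with an error described as "‘round@@GLIBC_2.2.5’ error when compling with BUILD_SHARED_LIBS=ON", and no matter what I tried I could not get past it. The autogen.sh and configure logs showed that python was still being picked up directly from /usr/bin/python, that is, Python 2.7 (bashrc aliases do not apply to scripts). Worried that this was a Python compatibility problem, I made the switch system-wide: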
sudo mv /usr/bin/python /usr/bin/python.bak.bak
sudo ln -s /usr/local/bin/python3.7 /usr/bin/python
The build still failed. I filed an issue on GitHub and then went back to read the libsndfile README more carefully:
Configuring CMake
You can pass additional options with /D<parameter>=<value> when you run cmake command. Some useful system options:
CMAKE_C_FLAGS - additional C compiler flags
CMAKE_BUILD_TYPE - configuration type, DEBUG, RELEASE, RELWITHDEBINFO or MINSIZEREL. DEBUG is default
CMAKE_INSTALL_PREFIX - build install location, the same as --prefix option of configure script
Useful libsndfile options:
BUILD_SHARED_LIBS - build shared library (DLL under Windows) when ON, build static library otherwise. This option is ON by default.
BUILD_PROGRAMS - build libsndfile's utilities from programs/ directory, ON by default.
BUILD_EXAMPLES - build examples, ON by default.
BUILD_TESTING - build tests. Then you can run tests with ctest command, ON by default. Setting BUILD_SHARED_LIBS to ON disables this option.
ENABLE_EXTERNAL_LIBS - enable Ogg, Vorbis and FLAC support. This option is available and set to ON if all dependency libraries were found.
ENABLE_CPU_CLIP - enable tricky cpu specific clipper. Enabled and set to ON when CPU clips negative\positive. Don't touch it if you are not sure
ENABLE_BOW_DOCS - enable black-on-white documentation theme, OFF by default.
ENABLE_EXPERIMENTAL - enable experimental code. Don't use it if you are not sure. This option is OFF by default.
ENABLE_CPACK - enable CPack support. This option is ON by default.
ENABLE_PACKAGE_CONFIG - generate and install package config file. This option is ON by default.
ENABLE_STATIC_RUNTIME - enable static runtime on Windows platform, OFF by default.
ENABLE_COMPATIBLE_LIBSNDFILE_NAME - set DLL name to libsndfile-1.dll (canonical name) on Windows platform, sndfile.dll otherwise, OFF by default. Library name can be different depending on platform. The well known DLL name on Windows platform is libsndfile-1.dll, because the only way to build Windows library before was MinGW toolchain with Autotools. This name is native for MinGW ecosystem, Autotools constructs it using MinGW platform rules from sndfile target. But when you build with CMake using native Windows compiler, the name is sndfile.dll. This is name for native Windows platform, because Windows has no library naming rules. It is preferred because you can search library using package manager or CMake's find_library command on any platform using the same sndfile name.
Deprecated options:
DISABLE_EXTERNAL_LIBS - disable Ogg, Vorbis and FLAC support. Replaced by ENABLE_EXTERNAL_LIBS
DISABLE_CPU_CLIP - disable tricky cpu specific clipper. Replaced by ENABLE_CPU_CLIP
BUILD_STATIC_LIBS - build static library. Use BUILD_SHARED_LIBS instead
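Rechecking the configuration, I took BUILD_SHARED_LIBS=ON to be mainly a Windows concern (the README mentions DLLs). I also checked /usr/local/lib and /usr/local/include: the libsndfile .so and its headers were both there. At this point I was tearing my hair out.
Next I decided to reconfigure and rebuild wav2letter itself. The log output after cmake all looked normal: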
-- Found ZLIB: /usr/lib/x86_64-linux-gnu/libz.so (found version "1.2.8")
-- Found Ogg: /usr/lib/x86_64-linux-gnu/libogg.so (found version "1.3.2")
-- Required SndFile dependency Ogg found.
-- Found Vorbis: /usr/lib/x86_64-linux-gnu/libvorbis.so (found version "1.3.5")
-- Required SndFile dependency Vorbis found.
-- Found VorbisEnc: /usr/lib/x86_64-linux-gnu/libvorbisenc.so (found version "1.3.5")
-- Required SndFile dependency VorbisEnc found.
-- Found FLAC: /usr/lib/x86_64-linux-gnu/libFLAC.so (found version "1.3.1")
-- Required SndFile dependency FLAC found.
-- Found SNDFILE: /usr/local/include
-- Found libsndfile: (lib: /usr/local/lib/libsndfile.so include: /usr/local/include
-- libsndfile found.
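This time, once the build finished, Train no longer reported the error and finally ran through, presumably because the earlier Train binary had been built against the previous libsndfile.
Training prints no progress output. In CPU mode it took roughly 40 hours on my machine. At the end, the training directory librispeech_clean_trainlogs contained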
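To double-check which libraries the configure step picked up, the "-- Found" lines of the cmake output can be collected into a table. A small sketch, assuming the output has been saved as text; the regular expression is written only against the line shapes shown above, and the function name is my own.

```python
import re

# Matches lines such as:
#   -- Found Ogg: /usr/lib/x86_64-linux-gnu/libogg.so (found version "1.3.2")
#   -- Found SNDFILE: /usr/local/include
FOUND_RE = re.compile(
    r'^-- Found (\w+): (/\S+)(?: \(found version "([^"]+)"\))?'
)


def parse_found_lines(cmake_output):
    """Map package name -> (path, version or None) from cmake output text."""
    found = {}
    for line in cmake_output.splitlines():
        match = FOUND_RE.match(line.strip())
        if match:
            name, path, version = match.groups()
            found[name] = (path, version)
    return found
```

Libraries that resolve under /usr/local rather than /usr/lib are the ones built by hand, which makes it easy to confirm that the new libsndfile is the one being used.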
001_config
001_log
001_model_last.bin
001_model_lists#dev-clean.lst.bin
001_perf
Judging from decode.cfg, 001_model_lists#dev-clean.lst.bin is the file used for decoding, so that should be our trained model.
Continuing with the tutorial, I ran decode:
./wav2letter/build/Decoder --flagsfile wav2letter/tutorials/1-librispeech_clean/decode.cfg
This step's log is extremely verbose, and at the start there were many messages like these:
Falling back to using letters as targets for the unknown word: valleyed
Falling back to using letters as targets for the unknown word: woodbegirt
Falling back to using letters as targets for the unknown word: citadelled
Falling back to using letters as targets for the unknown word: dedalos
Falling back to using letters as targets for the unknown word: hazewrapped
Falling back to using letters as targets for the unknown word: chiaroscurists
Falling back to using letters as targets for the unknown word: chiaroscurist
Falling back to using letters as targets for the unknown word: crampness
Falling back to using letters as targets for the unknown word: angor
Falling back to using letters as targets for the unknown word: greeing
Falling back to using letters as targets for the unknown word: million'd
Falling back to using letters as targets for the unknown word: sharp'st
.......
Skipping unknown entry: 'semon's'
Skipping unknown entry: 'battleax'
Falling back to using letters as targets for the unknown word: andella
Falling back to using letters as targets for the unknown word: andella
Skipping unknown entry: 'andella'
Skipping unknown entry: 'andella'
Falling back to using letters as targets for the unknown word: kaffar's
Skipping unknown entry: 'kaffar's'
Falling back to using letters as targets for the unknown word: thel
....
|T|: and henry might return to england at any moment
|P|: and henry might return to england at any moment
[sample: test-clean-61-70968-0048, WER: 0%, LER: 0%, slice WER: 17.9706%, slice LER: 8.60217%, progress (slice 1): 99.542%]
|T|: i love thee with a love i seemed to lose with my lost saints i love thee with the breath smiles tears of all my life and if god choose i shall but love thee better after death
|P|: i love thee with a love i seemed to lose with my lost saints i love thee with the breath smiles tears of all my length and if god choose i shall but love the better after death
[sample: test-clean-908-31957-0025, WER: 5.26316%, LER: 3.42857%, slice WER: 18.705%, slice LER: 8.92016%, progress (slice 3): 98.0153%]
|T|: ain't they the greatest
|P|: and then the gratis
[sample: test-clean-4992-41806-0010, WER: 75%, LER: 30.4348%, slice WER: 17.9879%, slice LER: 8.60928%, progress (slice 1): 99.6947%]
|T|: on she hurried until sweeping down to the lagoon and the island lo the cotton lay before her
|P|: on she hurried until sweeping down to the lagoon and the island lo the cotton lay before her
[sample: test-clean-1995-1837-0027, WER: 0%, LER: 0%, slice WER: 17.9634%, slice LER: 8.59807%, progress (slice 1): 99.8473%]
|T|: the christmas holidays came and she and anne returned to the parsonage and to that happy home circle in which alone their natures expanded amongst all other people they shrivelled up more or less
|P|: the christmas holidays came and she and ann returned to the parsonage and that happy home circle in which alone their natures expanded amongst all other people that from all that more lush
[sample: test-clean-3575-170457-0039, WER: 23.5294%, LER: 11.2821%, slice WER: 18.7173%, slice LER: 8.9266%, progress (slice 3): 98.1679%]
|T|: in strict accuracy nothing should be included under the head of conspicuous waste but such expenditure as is incurred on the ground of an invidious pecuniary comparison
|P|: in strict accuracy nothing should be included under the head of conspicuous waste but such expenditure as is in court on the ground of an envious pecuniary comparison
[sample: test-clean-3570-5696-0009, WER: 11.1111%, LER: 4.7619%, slice WER: 17.9495%, slice LER: 8.58897%, progress (slice 1): 100%]
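Because I had seen many errors when testing flashlight earlier, I lost confidence, cancelled the decode, spent a long time fiddling with arrayfire and flashlight, and finally ran it again. The log above is in fact normal: |T| is the target (reference) transcript, |P| is the predicted hypothesis, the sample fields give the WER and LER for that one utterance, and slice WER and slice LER are the running totals for the decoding slice that utterance belongs to.
Another way to check: go into the test runtime directory, where there are three files: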
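To make the per-sample numbers concrete, WER and LER can be reproduced with a plain edit distance over words and over characters. This is a minimal sketch, not the decoder's own code; treating spaces as ordinary characters for LER is my assumption, although it does reproduce the values logged for sample test-clean-4992-41806-0010.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance: insertions, deletions, substitutions cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(
                prev[j] + 1,             # delete r
                cur[j - 1] + 1,          # insert h
                prev[j - 1] + (r != h),  # substitute, or match at no cost
            ))
        prev = cur
    return prev[-1]


def error_rate(ref, hyp):
    """Edit distance as a percentage of the reference length."""
    return 100.0 * edit_distance(ref, hyp) / len(ref)


target = "ain't they the greatest"
prediction = "and then the gratis"
wer = error_rate(target.split(), prediction.split())  # word level
ler = error_rate(list(target), list(prediction))      # character level
print("WER: %.4f%%, LER: %.4f%%" % (wer, ler))
# prints "WER: 75.0000%, LER: 30.4348%"
```

Three of the four words are wrong, giving 75%, and 7 character edits over 23 reference characters give 30.4348%, matching the log line for that sample.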
-rw-rw-r-- 1 352708 10月 8 15:54 lists#test-clean.lst.hyp
-rw-rw-r-- 1 941511 10月 8 15:54 lists#test-clean.lst.log
-rw-rw-r-- 1 359475 10月 8 15:54 lists#test-clean.lst.ref
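These are the decoded hypotheses (.hyp), the log of the test run (.log), and the reference transcripts of the test set (.ref). The last line of the log file gives the overall result: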
[Decode lists/test-clean.lst (2620 samples) in 182.909s (actual decoding time 0.275s/sample) -- WER: 18.5008, LER: 8.82783]
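When comparing several runs it can be handy to pull these numbers out of the summary line automatically. A small sketch, assuming the line always has the shape shown above; the function name and the returned keys are my own.

```python
import re

SUMMARY_RE = re.compile(
    r"\[Decode (?P<list>\S+) \((?P<samples>\d+) samples\) "
    r"in (?P<seconds>[\d.]+)s .*"
    r"WER: (?P<wer>[\d.]+), LER: (?P<ler>[\d.]+)\]"
)


def parse_decode_summary(line):
    """Extract list name, sample count, time, WER and LER; None if no match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "list": match.group("list"),
        "samples": int(match.group("samples")),
        "seconds": float(match.group("seconds")),
        "wer": float(match.group("wer")),
        "ler": float(match.group("ler")),
    }
```

Feeding it the last line of the .log file gives the overall WER and LER as floats, ready to be tabulated across experiments.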
The flashlight make test errors were as follows:
Start 1: AutogradTest
1/11 Test #1: AutogradTest .....................***Failed 70.15 sec
Start 2: OptimTest
2/11 Test #2: OptimTest ........................***Failed 0.04 sec
Start 3: ModuleTest
3/11 Test #3: ModuleTest .......................***Failed 1.01 sec
Start 4: SerializationTest
4/11 Test #4: SerializationTest ................***Failed 2.79 sec
Start 5: NNUtilsTest
5/11 Test #5: NNUtilsTest ...................... Passed 0.04 sec
Start 6: DatasetTest
6/11 Test #6: DatasetTest ...................... Passed 0.84 sec
Start 7: DatasetUtilsTest
7/11 Test #7: DatasetUtilsTest ................. Passed 0.01 sec
Start 8: MeterTest
8/11 Test #8: MeterTest ........................ Passed 0.03 sec
Start 9: AllReduceTest
9/11 Test #9: AllReduceTest ....................***Exception: Other 0.51 sec
Start 10: ContribModuleTest
10/11 Test #10: ContribModuleTest ................***Failed 0.17 sec
Start 11: ContribSerializationTest
11/11 Test #11: ContribSerializationTest ......... Passed 0.04 sec
45% tests passed, 6 tests failed out of 11
Total Test time (real) = 75.65 sec
The following tests FAILED:
1 - AutogradTest (Failed)
2 - OptimTest (Failed)
3 - ModuleTest (Failed)
4 - SerializationTest (Failed)
9 - AllReduceTest (OTHER_FAULT)
10 - ContribModuleTest (Failed)
Errors while running CTest
Makefile:71: recipe for target 'test' failed
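At first I did not know what to do, so I traced the cause level by level and found that the tests can be run individually from the build/test directory: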
-rwxrwxr-x 1 2883824 10月 8 14:35 AllReduceTest*
-rwxrwxr-x 1 3202496 10月 8 14:35 AutogradTest*
drwxrwxr-x 14 4096 10月 8 14:23 CMakeFiles/
-rw-rw-r-- 1 1032 10月 8 14:22 cmake_install.cmake
-rwxrwxr-x 1 2813208 10月 8 14:29 ContribModuleTest*
-rwxrwxr-x 1 2804752 10月 8 14:29 ContribSerializationTest*
-rw-rw-r-- 1 811 10月 8 14:22 CTestTestfile.cmake
-rwxrwxr-x 1 2947960 10月 8 14:34 DatasetTest*
-rwxrwxr-x 1 2866632 10月 8 14:29 DatasetUtilsTest*
drwxrwxr-x 4 4096 10月 8 14:22 googletest/
-rw-rw-r-- 1 489359 10月 8 14:22 Makefile
-rwxrwxr-x 1 2892104 10月 8 14:32 MeterTest*
-rwxrwxr-x 1 2856544 10月 8 14:29 ModuleTest*
-rwxrwxr-x 1 2798672 10月 8 14:35 NNUtilsTest*
-rwxrwxr-x 1 2861760 10月 8 14:32 OptimTest*
-rwxrwxr-x 1 2938736 10月 8 14:31 SerializationTest*
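Running every test binary in that directory one at a time can be scripted. The sketch below assumes, as in the listing above, that the test executables are the files whose names end in "Test"; the function names are my own.

```python
import os
import subprocess


def find_test_binaries(test_dir):
    """Return sorted paths of executable regular files named '*Test'."""
    found = []
    for entry in sorted(os.listdir(test_dir)):
        path = os.path.join(test_dir, entry)
        if (entry.endswith("Test") and os.path.isfile(path)
                and os.access(path, os.X_OK)):
            found.append(path)
    return found


def run_each(test_dir):
    """Run each test binary on its own and return {name: exit code}.

    Output goes straight to the terminal, so the detailed googletest log
    of every binary stays visible.
    """
    results = {}
    for path in find_test_binaries(test_dir):
        proc = subprocess.run([path])
        results[os.path.basename(path)] = proc.returncode
    return results
```

Since these are googletest binaries, a single case can also be selected with the --gtest_filter flag, for example --gtest_filter=AutogradTest.Rnn.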
So I ran AutogradTest on its own, which produced a detailed log:
Value of: jacobianTestImpl(func_weightNorm_in, in, 1E-1)
Actual: false
Expected: true
[ FAILED ] AutogradTest.WeightNormConv (2669 ms)
[ RUN ] AutogradTest.Rnn
unknown file: Failure
C++ exception with description "rnn not yet implemented on CPU" thrown in the test body.
[ FAILED ] AutogradTest.Rnn (0 ms)
[ RUN ] AutogradTest.Lstm
unknown file: Failure
C++ exception with description "rnn not yet implemented on CPU" thrown in the test body.
[ FAILED ] AutogradTest.Lstm (0 ms)
[ RUN ] AutogradTest.Gru
unknown file: Failure
C++ exception with description "rnn not yet implemented on CPU" thrown in the test body.
[ FAILED ] AutogradTest.Gru (0 ms)
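So RNNs are not currently supported on the CPU backend. wav2letter runs a CNN here, though, so it does not matter in my case. Along the way I also had to install spdlog for arrayfire. After all this tinkering, I finally got through the whole pipeline for the first time.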