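TTS (Text to Speech) converts text into speech so that a machine can "speak". It takes text generated by the computer itself or supplied from outside and outputs it as intelligible, fluent spoken Chinese (or speech in another language). It is a branch of speech synthesis.
This article implements TTS by porting Ekho. Ekho (余音) is a free, open-source Chinese speech synthesis (TTS) engine written by a Chinese developer, currently at version 8.3. It supports Cantonese, Mandarin, Zhaoan Hakka, Tibetan, Yayan (the lingua franca of ancient China), and Korean (experimental); English is handled indirectly through Festival. Ekho runs on Linux, Windows, and Android.
Because the Ekho build ported here adds Festival and speechd support, it reads English accurately and performs rudimentary syntactic and semantic analysis: it recognizes most polyphonic characters and idioms, reads dates correctly, and distinguishes 一二八 (read digit by digit) from 一百二十八 (one hundred and twenty-eight). Because Ekho uses a corpus of recorded human speech, its voice is also smoother and more natural than that of most other open-source TTS engines. The porting work follows.
Two reference links first:
Linux声音解决方案与TTS引擎 (Linux sound solutions and TTS engines)
ekho-7.7.1嵌入式Linux移植全过程 (Complete walkthrough of porting ekho-7.7.1 to embedded Linux)
Note: it is best to keep all Ekho-related dependency packages under the same parent directory.
1. Porting espeak-ng-1.50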
./configure --prefix=/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr --host=aarch64-buildroot-linux-gnu --target=aarch64-linux CC=aarch64-linux-gcc LDFLAGS=-L/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib CPPFLAGS=-I/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/include
make && make install
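The same sysroot path recurs in almost every configure command in this walkthrough. A minimal sketch of factoring it into shell variables is shown here; the names SYSROOT, PREFIX, CROSS_HOST, and cross_flags are illustrative and not part of the original steps.

```shell
# Illustrative variable names; the paths are the ones used in this article.
SYSROOT=/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot
PREFIX="$SYSROOT/usr"
CROSS_HOST=aarch64-buildroot-linux-gnu

# Print the arguments shared by the configure invocations, one per line.
cross_flags() {
    printf '%s\n' \
        "--prefix=$PREFIX" \
        "--host=$CROSS_HOST" \
        "LDFLAGS=-L$PREFIX/lib" \
        "CPPFLAGS=-I$PREFIX/include"
}

# Example (not run here): ./configure $(cross_flags) CC=aarch64-linux-gcc
```

Package-specific options such as --without-pulse would still be appended by hand for each package.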
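One thing about espeak-ng I never figured out: no phontab file is generated in the espeak-ng-data directory. At first I copied the phoneme files from the espeak-data directory of espeak-1.48 (the version built in the ekho embedded porting walkthrough) into espeak-ng-data. With that, ekho-8.3 compiled and played Chinese normally, but not English. I later found that the phoneme files of espeak and espeak-ng are no longer compatible, which is probably why ekho-8.3 still could not play English even with Festival support added. In the end I had to port ekho-8.0 instead. If you do not need ekho-8.3, you can skip the espeak-ng port. The steps below are for ekho-8.0, which still depends on espeak.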
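A quick way to confirm whether a data directory actually contains phontab before building Ekho against it is sketched here. The helper name has_phontab is hypothetical, and the directory you pass depends on where your build installed the data.

```shell
# Report whether an espeak / espeak-ng data directory contains phontab.
has_phontab() {
    data_dir=$1
    if [ -f "$data_dir/phontab" ]; then
        echo "phontab found in $data_dir"
    else
        echo "phontab missing in $data_dir" >&2
        return 1
    fi
}

# Example (the path is a placeholder): has_phontab /path/to/espeak-ng-data
```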
2. Porting dotconf-1.3
./configure --prefix=/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr --host=aarch64-linux LDFLAGS=-L/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib CFLAGS=-I/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/include CC=aarch64-linux-gcc --with-sysroot=/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot
make && make install
dotconf is now ported.
3. Porting speech-dispatcher-0.8.8
./configure --prefix=/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr --host=aarch64-linux LDFLAGS=-L/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib CFLAGS=-I/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/include CC=aarch64-linux-gcc --without-pulse --without-alsa --without-espeak-ng
The build fails with:
say.c:(.text+0xc30): undefined reference to `rpl_malloc'
../../../src/api/c/.libs/libspeechd.so: undefined reference to `rpl_realloc'
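Comment out lines 221 and 227 of config.h in the current directory. Judging by the error, these are the lines that define malloc and realloc as rpl_malloc and rpl_realloc.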
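The rpl_malloc and rpl_realloc errors are a well-known side effect of autoconf's malloc/realloc checks when cross-compiling: configure cannot run a test program, assumes the functions are broken, and redefines them in config.h. A minimal sketch of commenting those defines out with sed instead of by hand is shown here. The helper name comment_out_rpl is hypothetical, and it assumes the defines have the form `#define malloc rpl_malloc`.

```shell
# Comment out "#define malloc rpl_malloc" and "#define realloc rpl_realloc"
# in the given config.h, leaving every other line untouched.
comment_out_rpl() {
    cfg=$1
    sed -E 's@^([[:space:]]*#[[:space:]]*define[[:space:]]+(malloc|realloc)[[:space:]]+rpl_(malloc|realloc)[[:space:]]*)$@/* \1 */@' \
        "$cfg" > "$cfg.tmp" && mv "$cfg.tmp" "$cfg"
}

# Example: comment_out_rpl config.h
```

A commonly used alternative is to pass ac_cv_func_malloc_0_nonnull=yes ac_cv_func_realloc_0_nonnull=yes to configure so that the defines are never generated; that variant was not tried in this article.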
4. Porting speech_tools
./configure --prefix=/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr --host=aarch64-buildroot-linux-gnu --target=aarch64-buildroot-linux-gnu CC=aarch64-buildroot-linux-gnu-gcc LDFLAGS=-L/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib CPPFLAGS=-I/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/include
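In the speech_tools directory:
Run vim config/config +64 and uncomment SHARED=2 to enable building shared libraries. If speech_tools is built without shared libraries, the later ekho build fails. See the figure.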
cd config/compilers
vim gcc_defaults.mak
Change the compiler to the cross compiler, as shown in the figure below.
Otherwise the build fails with:
linux_sound.cc:395:10: 致命错误:alsa/asoundlib.h:没有那个文件或目录
This happens because the build still uses the host gcc by default, which does not look in the /opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr directory.
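One way to check which compiler a build will really use is to ask it for its target triplet with -dumpmachine, a standard gcc option. A minimal sketch is shown here; the helper name targets_arch is hypothetical.

```shell
# Succeeds if the compiler's -dumpmachine output starts with the expected
# architecture prefix (for example "aarch64").
targets_arch() {
    cc=$1
    arch=$2
    machine=$("$cc" -dumpmachine 2>/dev/null) || return 1
    case "$machine" in
        "$arch"*) return 0 ;;
        *) echo "$cc targets $machine, expected $arch" >&2; return 1 ;;
    esac
}

# Example: targets_arch aarch64-linux-gcc aarch64
```

If the check fails for the compiler named in gcc_defaults.mak, the build is still picking up the host toolchain.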
In the speech_tools directory:
cd config/rules
Run gedit library.mak +110 and comment out that line.
Run make && make install, then enter the lib directory:
cd lib
cp libest*.a /opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib
This copies the static libraries generated under speech_tools/lib into the sysroot; the later ekho build needs them.
5. Porting festival
./configure --prefix=/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/ --host=aarch64-linux
Modify the Makefile; the modified Makefile is shown below.
Run make && make install to complete the installation.
6. Porting ibmtts-sdk-6.7.4
./autogen.sh
./configure --prefix=/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr --host=aarch64-buildroot-linux-gnu --target=aarch64-buildroot-linux-gnu CC=aarch64-buildroot-linux-gnu-gcc LDFLAGS=-L/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib CPPFLAGS=-I/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/include
Modify the Makefile.
make clean
make 2>&1 | tee make.errs
make install
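Errors at this stage can be ignored as long as the correct shared libraries can be found in ibmtts-sdk-6.7.4/.libs after installation. Then copy those shared libraries to /opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib.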
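The copy step can be sketched as a small helper that takes every shared library from a .libs directory and preserves symlinks. The name copy_shared_libs is hypothetical; the source and destination are the directories named above.

```shell
# Copy every shared library from a libtool .libs directory into a destination,
# keeping symlinks (libfoo.so -> libfoo.so.1) as symlinks.
# Returns non-zero if no shared library was found.
copy_shared_libs() {
    src=$1
    dest=$2
    found=0
    for lib in "$src"/*.so*; do
        [ -e "$lib" ] || [ -L "$lib" ] || continue
        cp -P "$lib" "$dest"/ || return 1
        found=1
    done
    [ "$found" -eq 1 ]
}

# Example:
# copy_shared_libs ibmtts-sdk-6.7.4/.libs \
#     /opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib
```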
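7. Porting ekho-8.0
I originally wanted to port ekho-8.3, but it depends on espeak-ng, which is very hard to get working, so I settled on ekho-8.0.
Modify the configure file:
Run gedit configure +17822 and change that line to: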
LIBS="-lestools -lestbase -leststring -lncurses -lasound -ldl -lm -lstdc++ -lgomp"
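See the figure.
At line 17897 of configure, change the Festival entry to link against the shared libraries; otherwise configure will not enable Festival support. Change it as shown in the screenshot below.
At line 17897 of configure, also set the shared libraries to link and the header directories; otherwise the build fails with undefined functions. Change it as shown in the screenshot below.
Note: ${prefix} is the ekho installation directory, which is also where the cross compiler is installed, i.e. the sysroot directory. After making these changes, run: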
./configure --prefix=/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr --host=aarch64-linux --target=aarch64-linux CC=aarch64-linux-gcc --enable-festival --enable-speechd --enable-shared LDFLAGS=-L/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib CPPFLAGS=-I/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/include --with-sysroot=/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr LT_SYS_LIBRARY_PATH=/usr/lib
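The screenshot above shows a successful configure run. If even one item reports no, the ekho build may fail or other problems may appear at run time.
Modify the Makefile:
In the Makefile under libmusicxml/samples, change gcc to the cross-compilation toolchain, as shown in the figure.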
cd libmusicxml/linux
make clean
make CC=aarch64-linux-gcc CXX=aarch64-linux-g++
This step changes the configure options used for speechd-api. Edit ekho-8.0/Makefile so that the command reads:
./configure --prefix=/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr --host=aarch64-linux CC=aarch64-linux-gcc CXX=aarch64-linux-g++ --with-pulse --with-alsa --with-espeak LDFLAGS=-L/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib CFLAGS=-I/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/include
See the screenshot below.
After the change, try make first.
It fails with:
In file included from /usr/include/stdlib.h:55:0,
from /opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/include/alsa/asoundlib.h:33,
from spd_audio.h:37,
from spd_audio.c:33:
/usr/include/bits/floatn.h:86:9: 错误:未知的类型名‘__float128’
typedef __float128 _Float128;
^~~~~~~~~~
In file included from spd_audio.c:65:0:
pulse.c: 在函数‘pulse_open’中:
pulse.c:132:39: 警告:将一个指针转换为大小不同的整数 [-Wpointer-to-int-cast]
id->pa_min_audio_length = pars[1]?(int)pars[1] : DEFAULT_PA_MIN_AUDIO_LENgTH;
^
make[4]: *** [Makefile:475:spd_audio.lo] 错误 1
make[4]: 离开目录“/ekho-8.0/speechd-api/src/audio”
make[3]: *** [Makefile:402:all-recursive] 错误 1
make[3]: 离开目录“/ekho-8.0/speechd-api/src”
make[2]: *** [Makefile:443:all-recursive] 错误 1
make[2]: 离开目录“/ekho-8.0/speechd-api”
make[1]: *** [Makefile:375:all] 错误 2
make[1]: 离开目录“/ekho-8.0/speechd-api”
cd ekho-8.0/speechd-api/src/audio
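Run gedit Makefile.am and change -I/usr/include to -I/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/include, as shown in the figure below. This points the header search path at the cross compiler's sysroot instead of the host headers.
Continuing with make will probably hit the same error a few more times; each time, go into the directory named in the error and apply the same change.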
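Rather than fixing one directory per failed build, the same substitution could be applied to every Makefile.am in one pass. This is a minimal sketch that assumes the offending flag is written literally as -I/usr/include, as in the audio directory; fix_host_includes is a hypothetical helper name.

```shell
# Rewrite the host header path -I/usr/include to the sysroot header path in
# every Makefile.am below the given directory. Safe to run more than once.
fix_host_includes() {
    top=$1
    sysroot_inc=$2
    find "$top" -name Makefile.am -exec grep -l -e '-I/usr/include' {} + |
    while IFS= read -r mk; do
        sed "s@-I/usr/include@-I$sysroot_inc@g" "$mk" > "$mk.tmp" && mv "$mk.tmp" "$mk"
        echo "patched $mk"
    done
}

# Example:
# fix_host_includes ekho-8.0/speechd-api \
#     /opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/include
```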
The next error:
mv -f .deps/ekho.Tpo .deps/ekho.Po
make[4]: *** 没有规则可制作目标“../../../libekho.a”,由“sd_ekho” 需求。 停止。
cd ekho-8.0/speechd-api/src/modules and comment out line 20 of Makefile.am.
Continue with make:
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_hw_params_any'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_set_params'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_hw_params_set_rate_near'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_hw_params_set_channels'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_open'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_prepare'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_drain'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestools.so: undefined reference to `GOMP_barrier'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestools.so: undefined reference to `GOMP_parallel'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_hw_params_sizeof'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_close'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_hw_params'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_drop'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_hw_params_set_access'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_resume'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestools.so: undefined reference to `omp_get_thread_num'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_state'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_wait'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestools.so: undefined reference to `omp_get_num_threads'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_strerror'
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/lib/libestbase.so: undefined reference to `snd_pcm_hw_params_set_format'
This produces a pile of undefined references.
Modify ekho-8.0/Makefile as shown in the figure.
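The undefined symbols in the log fall into two groups: the snd_* functions belong to ALSA (libasound), and the GOMP_* and omp_* functions belong to the GNU OpenMP runtime (libgomp), so the link line most likely needs -lasound and -lgomp. The exact Makefile change from the figure is not reproduced here. A minimal sketch that groups such linker errors by library is shown instead; suggest_libs is a hypothetical helper name.

```shell
# Read linker output on stdin and print the library flag each undefined
# symbol most likely needs, one unique flag per line.
suggest_libs() {
    sed -n -E "s/.*undefined reference to \`([A-Za-z0-9_]+)'.*/\1/p" |
    while IFS= read -r sym; do
        case "$sym" in
            snd_*)        echo "-lasound" ;;
            GOMP_*|omp_*) echo "-lgomp" ;;
            *)            echo "unknown: $sym" ;;
        esac
    done | sort -u
}

# Example: make 2>&1 | suggest_libs
```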
Run make && make install; at this point the installation is complete.
The following errors may also appear:
Error:
In file included from src/libekho_impl.cpp:63:0:
/opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/include/festival/festival.h:47:17: 致命错误:EST.h:没有那个文件或目录
EST.h is in the speech_tools/include directory. Run the following to copy the headers into the sysroot:
cd speech_tools/include
cp -r ./* /opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/include/
---------------------------------------------------------------------------------------------------------------------------------------
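You may also get an error that festival/festival.h cannot be found. In that case, go into the festival source directory, locate festival.h together with all the other headers in the same directory,
and copy them to /opt/toolchain/host/aarch64-buildroot-linux-gnu/sysroot/usr/include/festival.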
---------------------------------------------------------------------------------------------------------------------------------------
Error:
./config.h:141:16: 错误:‘std::rpl_malloc’尚未声明
#define malloc rpl_malloc
In the ekho-8.3 directory:
vim config.h +141
Comment out the line:
// #define malloc rpl_malloc
Error:
/opt/toolchain/host/lib/gcc/aarch64-buildroot-linux-gnu/6.5.0/../../../../aarch64-buildroot-linux-gnu/bin/ld: cannot find -lvorbisenc
/opt/toolchain/host/lib/gcc/aarch64-buildroot-linux-gnu/6.5.0/../../../../aarch64-buildroot-linux-gnu/bin/ld: cannot find -lvorbis
Select libvorbis in Buildroot and rebuild.
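Whether the package is really enabled can be checked in the Buildroot .config, where libvorbis is controlled by the BR2_PACKAGE_LIBVORBIS symbol. A minimal sketch follows; the helper name libvorbis_enabled is hypothetical.

```shell
# Succeeds if the given Buildroot .config has the libvorbis package enabled.
libvorbis_enabled() {
    grep -qx 'BR2_PACKAGE_LIBVORBIS=y' "$1"
}

# Example, from the Buildroot top-level directory:
# libvorbis_enabled .config || echo "enable libvorbis in make menuconfig, then run make"
```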
---------------------------------------------------------------------------------------------------------------------------------------
ekho -p 0 -a 0 -f 1.txt
Chinese, English, letters, digits, and dates were all played correctly, and most polyphonic characters were recognized.
The contents of 1.txt:
腾讯大厦11层ABC hello world 6月11号2020年
银行 行走 187 hello world 七上八下 斗志昂扬 才高八斗 随声附和 风和日丽 济济一堂 同舟共济 叹为观止 为虎作伥
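To repeat the test on another board, the sample file can be generated and played from a script. This is a minimal sketch using the two test sentences and the ekho options shown in this article; the helper names write_ekho_sample and play_ekho_sample are hypothetical.

```shell
# Write the two test sentences from this article to the given file.
write_ekho_sample() {
    out=$1
    printf '%s\n' \
        '腾讯大厦11层ABC hello world 6月11号2020年' \
        '银行 行走 187 hello world 七上八下 斗志昂扬 才高八斗 随声附和 风和日丽 济济一堂 同舟共济 叹为观止 为虎作伥' \
        > "$out"
}

# Play the file with the same options as in the article, if ekho is on PATH.
play_ekho_sample() {
    if command -v ekho >/dev/null 2>&1; then
        ekho -p 0 -a 0 -f "$1"
    else
        echo "ekho not found on PATH; skipping playback" >&2
    fi
}

# Example: write_ekho_sample 1.txt && play_ekho_sample 1.txt
```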