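For a recent project I had to evaluate several speech vendors, covering speech recognition (offline and online), semantic understanding, TTS (online and offline), offline command words, and even contextual dialogue with Baidu UNIT. I did not go very deep, but the survey covered a fair amount of ground. The main vendors were Baidu, iFLYTEK, Sogou, Unisound, and 奇梦者 (Qdreamer), and the work also included hardware (iFLYTEK's four-mic linear array, five-mic circular array, and the later six-mic circular array; Qdreamer's four-mic circular array plus a cylindrical microphone array). I wrote a lot of demos along the way, and each vendor has its own strengths. I won't list them all here and will write them up when I get the chance. For now, here is a simple example that combines Baidu online ASR with Unisound offline TTS to build a "repeater" that speaks back whatever it hears.
It's getting late, so I'll post the code first and add a few notes afterwards.
1. Preparation:
(1) The Baidu SDK JAR bdasr_V3_xxx_xxx.jar and the Unisound JAR usc.jar:
Baidu SDK download
Unisound SDK download
(Register and log in first, select the service, then download.)
Unzip the SDK, find the corresponding JARs in the sample project's libs directory (I won't go into detail), copy them into the libs directory of your own new project, and add them as libraries.
(2) .so files: likewise, find the corresponding .so files in the SDKs you just downloaded, create a jniLibs directory in your project (../src/main/), and copy them into it.
(3) Unisound offline TTS needs voice packs (used to produce different voices). Run your own project once first, then create a tts folder under the app's directory on the device and put the three files below into it. Note that the code in this post reads the models from /sdcard/unisound/tts/, so the files must end up at whatever path the service is configured with: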
backend_female
backend_lzl
frontend_model
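Before starting the TTS engine it can help to verify that the voice pack files are actually in place. The following is a minimal sketch in plain Java, not part of either SDK: the class name `TtsModelCheck` and the method `missing` are hypothetical, and only the three file names come from the list above.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: reports which TTS model files are missing from a directory. */
final class TtsModelCheck {
    // File names taken from the list above.
    static final String[] REQUIRED = {"backend_female", "backend_lzl", "frontend_model"};

    /** Returns the names of required files that do not exist as regular files in ttsDir. */
    static List<String> missing(File ttsDir) {
        List<String> absent = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!new File(ttsDir, name).isFile()) {
                absent.add(name);
            }
        }
        return absent;
    }

    public static void main(String[] args) {
        // The directory is an assumption; pass the real tts folder as the first argument.
        File dir = new File(args.length > 0 ? args[0] : "tts");
        System.out.println("Missing model files: " + missing(dir));
    }
}
```

An empty list means all three files were found; anything else names the files that still need to be copied.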
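2. Code:
(1) AndroidManifest.xml: the main things to watch are the permissions and Baidu's APP_ID, API_KEY, and SECRET_KEY. You need to create an application yourself to obtain these three values; I won't go over that step here.
(2) ASRService: here I wrapped Baidu speech recognition as a service and exposed an AIDL interface so that other apps can call it directly (the methods are incomplete; readers can add to and modify them as needed).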
package aoto.com.baidurecongdemo;
import android.app.Service;
import android.content.Intent;
import android.os.IBinder;
import android.os.Message;
import android.os.RemoteCallbackList;
import android.os.RemoteException;
import android.util.Log;
import com.baidu.speech.EventListener;
import com.baidu.speech.EventManager;
import com.baidu.speech.EventManagerFactory;
import com.baidu.speech.asr.SpeechConstant;
import com.unisound.client.SpeechConstants;
import com.unisound.client.SpeechSynthesizer;
import com.unisound.client.SpeechSynthesizerListener;
import org.json.JSONObject;
import java.io.File;
import java.util.LinkedHashMap;
import java.util.Map;
public class ASRService extends Service implements EventListener {
private EventManager asr;
public static String result;
private boolean logTime = true;
private String language;
long startTime = 0;
long endTime = 0;
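// Caution: creating an Activity with `new` bypasses the Android lifecycle. Kept from the original demo, where it is only used to call the empty test() stub.
private MainActivity mainActivity = new MainActivity();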
private SpeechSynthesizer mTTSPlayer;
private final String mFrontendModel= "/sdcard/unisound/tts/frontend_model";
private final String mBackendModel = "/sdcard/unisound/tts/backend_lzl";
// Callback list for pushing data to clients proactively (not used in this demo)
final RemoteCallbackList mCallbacks = new RemoteCallbackList();
private int mValue = 0;
private static final int REPORT_MSG = 1;
@Override
public void onCreate() {
System.out.println("服务创建了:" + "onCreate");
asr = EventManagerFactory.create(this, "asr");
asr.registerListener(this);
super.onCreate();
// Initialize the offline TTS engine
initTts();
}
@Override
public void onStart(Intent intent, int startId) {
System.out.println("服务开启了:" + "onStart");
super.onStart(intent, startId);
}
/**
* Put the test parameters here
*/
public void start() {
Map<String, Object> params = new LinkedHashMap<>();
String event = SpeechConstant.ASR_START; // replace with the event under test
params.put(SpeechConstant.ACCEPT_AUDIO_VOLUME, false);
// Pick the PID according to the language passed in by the caller
if (MainActivity.languge_id == 1) {
params.put(SpeechConstant.PID, 1736);
} else if (MainActivity.languge_id == 2) {
params.put(SpeechConstant.PID, 1836);
} else if (MainActivity.languge_id == 3) {
params.put(SpeechConstant.PID, 1636);
} else {
params.put(SpeechConstant.PID, 1536);
}
System.out.println("MMMMMMMMMMMMMMMMMMLanguage:" + MainActivity.languge_id);
// First test and generate recognition parameters with, for example, the "online recognition" screen. params is the same as in myRecognizer.start(params) in the ActivityRecog class.
// This JSON can be replaced with the one you want to test
String json = new JSONObject(params).toString();
asr.send(event, json, null, 0, 0);
printLog("输入参数:" + json);
}
private void stop() {
asr.send(SpeechConstant.ASR_STOP, null, null, 0, 0); //
}
// EventListener callback
/**
* @param name : current ASR event/state
* @param params : result string for the event
* @author why
*/
@Override
public void onEvent(String name, String params, byte[] data, int offset, int length) {
String logTxt = "name: " + name;
if (params != null && !params.isEmpty()) {
logTxt += " ;params :" + params;
// Use the event name to determine the current state and pick up the recognition result
if (name.equals("asr.partial")) {
String start = "[";
String end = "]";
// Cut the result out of the JSON string
result = StringTools.SubStringTwoChar(params, start, end);
startTime = System.currentTimeMillis();
}
}
if (name.equals("asr.finish")) {
// Recognition finished
if(MainActivity.btn.getText().toString().equals("停止")){
// Check for null first; calling equals() on a null result would throw
if (result == null || result.equals("")){
result="没有识别到声音";
}
MainActivity.btn.setText("开始");
}
else{
if (result == null || result.equals("")){
result="No sound at all";
}
MainActivity.btn.setText("start");
}
MainActivity.txtResult.setText(result);
// Speak the result through TTS
mTTSPlayer.playText(result);
}
// Session-end marker; pass the data on (the original intent was to broadcast it)
if (name.equals("asr.exit")) {
mainActivity.test(result);
}
if (name.equals(SpeechConstant.CALLBACK_EVENT_ASR_PARTIAL)) {
// params and data may be null, so guard before using them
if (params != null && params.contains("\"nlu_result\"")) {
if (data != null && length > 0 && data.length > 0) {
logTxt += ", 语义解析结果:" + new String(data, offset, length);
}
}
} else if (data != null) {
logTxt += " ;data length=" + data.length;
}
// Print the recognition log
printLog(logTxt);
}
// Log helper for recognition events
private void printLog(String text) {
if (logTime) {
text += " ;time=" + System.currentTimeMillis();
}
text += "\n";
Log.i(getClass().getName(), text);
}
// Return the binder through which clients call the service methods
@Override
public IBinder onBind(Intent intent) {
System.out.println("MMMMMMMMMMMMMMMMM" + "服务绑定了!");
return new MyASRresultAIDL();
}
// Binder exposing Baidu speech recognition
class MyASRresultAIDL extends ASRresultAIDL.Stub {
@Override
public String getASRResult() throws RemoteException {
return result;
}
@Override
public void callStart() throws RemoteException {
start();
}
@Override
public void callStop() throws RemoteException {
stop();
}
}
// Initialize the offline speech synthesis engine
private void initTts() {
// Create the synthesizer object
//mTTSPlayer = new SpeechSynthesizer(this, Config.appKey, Config.secret);
mTTSPlayer = new SpeechSynthesizer(this, "", "");
System.out.println("TTTTTTTTTTTTTTTTTTS:"+mTTSPlayer);
// Use local (offline) synthesis
mTTSPlayer.setOption(SpeechConstants.TTS_SERVICE_MODE, SpeechConstants.TTS_SERVICE_MODE_LOCAL);
File _FrontendModelFile = new File(mFrontendModel);
if (!_FrontendModelFile.exists()) {
System.out.println("文件:" + mFrontendModel + "不存在,请将assets下相关文件拷贝到SD卡指定目录!");
}
File _BackendModelFile = new File(mBackendModel);
if (!_BackendModelFile.exists()) {
System.out.println("文件:" + mBackendModel + "不存在,请将assets下相关文件拷贝到SD卡指定目录!");
}
// Set the frontend model
mTTSPlayer.setOption(SpeechConstants.TTS_KEY_FRONTEND_MODEL_PATH, mFrontendModel);
// Set the backend model
mTTSPlayer.setOption(SpeechConstants.TTS_KEY_BACKEND_MODEL_PATH, mBackendModel);
// Set the callback listener
mTTSPlayer.setTTSListener(new SpeechSynthesizerListener() {
@Override
public void onEvent(int type) {
switch (type) {
case SpeechConstants.TTS_EVENT_INIT:
// Initialization succeeded
Log.i("", "回调成功 ");
break;
case SpeechConstants.TTS_EVENT_SYNTHESIZER_START:
// Synthesis started
Log.i("", "开始同步");
break;
case SpeechConstants.TTS_EVENT_SYNTHESIZER_END:
// Synthesis finished
Log.i("", "结束同步");
break;
case SpeechConstants.TTS_EVENT_BUFFER_BEGIN:
// Buffering started
Log.i("", "开始缓存 ");
break;
case SpeechConstants.TTS_EVENT_BUFFER_READY:
// Buffering complete
Log.i("", "准备缓存 ");
break;
case SpeechConstants.TTS_EVENT_PLAYING_START:
// Playback started
Log.i("", "播放开始 ");
break;
case SpeechConstants.TTS_EVENT_PLAYING_END:
// Playback finished
Log.i("", "播放结束 ");
break;
case SpeechConstants.TTS_EVENT_PAUSE:
// Paused
Log.i("", "暂停 ");
break;
case SpeechConstants.TTS_EVENT_RESUME:
// Resumed
//log_i("resume");
break;
case SpeechConstants.TTS_EVENT_STOP:
// Stopped
Log.i("", "停止");
break;
case SpeechConstants.TTS_EVENT_RELEASE:
// Resources released
Log.i("", "释放资源");
break;
default:
break;
}
}
@Override
public void onError(int type, String errorMSG) {
// Synthesis error; log the details instead of a bare "Error"
Log.e("","Error type=" + type + ", msg=" + errorMSG);
}
});
// Initialize the synthesis engine
int returnvalue= mTTSPlayer.init("");
System.out.println("EEEEEEEEEEE:"+"初始化引擎"+returnvalue);
}
}
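The if/else chain in start() maps the spinner position to a Baidu PID. As a small sketch of the same mapping in table form (plain Java, with the hypothetical names `AsrLanguage` and `pidFor`; the PID values and their order are copied from the service code above, and anything out of range falls back to 1536 just as the original else branch does):

```java
/** Sketch: maps the spinner position used in this demo to a Baidu ASR PID. */
final class AsrLanguage {
    // Index = spinner position: 0 Mandarin, 1 English, 2 Sichuanese, 3 Cantonese.
    // Values copied from ASRService.start().
    private static final int[] PIDS = {1536, 1736, 1836, 1636};

    /** Returns the PID for a language index, defaulting to Mandarin (1536). */
    static int pidFor(int languageId) {
        if (languageId < 0 || languageId >= PIDS.length) {
            return PIDS[0];
        }
        return PIDS[languageId];
    }

    public static void main(String[] args) {
        for (int id = 0; id < 5; id++) {
            System.out.println("language " + id + " -> PID " + pidFor(id));
        }
    }
}
```

With a helper like this, start() would only need a single params.put(SpeechConstant.PID, pidFor(MainActivity.languge_id)) call, and adding a language means adding one array entry.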
(3) StringTools: a utility class used by ASRService
package aoto.com.baidurecongdemo;
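/**
* author:why
* created on: 2018/1/31 15:25
* description:
*/
public class StringTools {
// Extracts the substring between two given characters or strings in a target string
/**
*
* @param target : the string to process
* @param start : the opening string/character
* @param end : the closing string/character
* @return the text between start and end, or "" if either delimiter is missing
*/
public static String SubStringTwoChar(String target, String start, String end) {
if (target == null) {
return "";
}
int startIndex = target.indexOf(start);
if (startIndex < 0) {
return "";
}
int from = startIndex + start.length();
// Search for the closing delimiter only after the opening one
int endIndex = target.indexOf(end, from);
if (endIndex < 0) {
return "";
}
return target.substring(from, endIndex);
}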
}
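One thing to be aware of: cutting the text between "[" and "]" keeps the JSON quotes, so the value handed to TTS looks like "你好" with the quotation marks included. Below is a hedged sketch of a stricter extraction in plain Java. It assumes the partial-result payload has the shape {"results_recognition":["..."], ...}; the class name `AsrParams` and the method `firstResult` are hypothetical. In an Android project, org.json's JSONObject would be the more natural tool; a regular expression is used here only to keep the example self-contained, and it does not unescape JSON escape sequences.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: pulls the first recognition candidate out of the params JSON string. */
final class AsrParams {
    // Matches the first quoted string inside the "results_recognition" array (assumed field name).
    private static final Pattern FIRST_RESULT = Pattern.compile(
            "\"results_recognition\"\\s*:\\s*\\[\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Returns the first candidate without surrounding quotes, or "" if none is found. */
    static String firstResult(String params) {
        if (params == null) {
            return "";
        }
        Matcher m = FIRST_RESULT.matcher(params);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        String sample = "{\"results_recognition\":[\"hello world\"],\"result_type\":\"partial_result\"}";
        System.out.println(firstResult(sample));
    }
}
```

Returning an empty string on a missing or empty array lets the caller fall back to its "no sound" message without extra null checks.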
(4) MainActivity: calls the service and demonstrates the result
package aoto.com.baidurecongdemo;
import android.Manifest;
import android.content.ComponentName;
import android.content.Intent;
import android.content.ServiceConnection;
import android.content.pm.PackageManager;
import android.os.IBinder;
import android.os.RemoteException;
import android.support.v4.app.ActivityCompat;
import android.support.v4.content.ContextCompat;
import android.support.v7.app.AppCompatActivity;
import android.os.Bundle;
import android.util.Log;
import android.view.View;
import android.widget.AbsSpinner;
import android.widget.AdapterView;
import android.widget.ArrayAdapter;
import android.widget.Button;
import android.widget.Spinner;
import android.widget.TextView;
import com.baidu.speech.asr.SpeechConstant;
/**
* @author why
*/
public class MainActivity extends AppCompatActivity implements InterfaceTest{
protected TextView txtLog;
public static TextView txtResult;
public static Button btn;
private String result;
private static String DESC_TEXT = "在线识别日志";
public static int languge_id = 0;
private Intent intent;
private MyServiceConnection connection;
private ASRresultAIDL asRresultAIDL;
private Spinner spinner;
private String butContext;
/**
* Sets up the UI and binds to ASRService
*/
@Override
protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);
setContentView(R.layout.common_mini);
initView();
intent = new Intent(this, ASRService.class);
connection = new MyServiceConnection();
bindService(intent, connection, BIND_AUTO_CREATE);
}
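// Toggle speech recognition on and off
public void startOrEnd(View view) {
butContext = btn.getText().toString();
System.out.println("BBBBBBBBBBBBBBBBBBBBBB"+butContext);
// Check the button label first; it also reflects the selected language
if (butContext.equals("开始") || butContext.equals("start")) {
// Start recognition; the stored result is reset to a placeholder below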
try {
asRresultAIDL.callStart();
} catch (RemoteException e) {
e.printStackTrace();
}
if (butContext.equals("开始")) {
ASRService.result="没有识别到声音";
txtResult.setText("请说,我在听");
btn.setText("停止");
} else {
ASRService.result="No sound at all";
txtResult.setText("Please say,I am listening");
btn.setText("stop");
}
} else {
if (butContext.equals("停止")) {
btn.setText("开始");
try {
txtResult.setText("识别结果:" + asRresultAIDL.getASRResult());
} catch (RemoteException e) {
e.printStackTrace();
}
} else {
btn.setText("start");
try {
txtResult.setText("ASR result:" + asRresultAIDL.getASRResult());
} catch (RemoteException e) {
e.printStackTrace();
}
}
try {
asRresultAIDL.callStop();
} catch (RemoteException e) {
e.printStackTrace();
}
}
}
@Override
protected void onDestroy() {
unbindService(connection);
super.onDestroy();
}
private void initView() {
txtResult = (TextView) findViewById(R.id.txtResult);
txtLog = (TextView) findViewById(R.id.txtLog);
btn = (Button) findViewById(R.id.btn);
txtLog.setText(DESC_TEXT + "\n");
spinner = findViewById(R.id.languge_list);
// Set up the drop-down list
ArrayAdapter<CharSequence> adapter = ArrayAdapter.createFromResource(this, R.array.Data, android.R.layout.simple_spinner_item);
adapter.setDropDownViewResource(android.R.layout.simple_spinner_dropdown_item);
spinner.setAdapter(adapter);
spinner.setPrompt("选择语言");
// Listen for selection changes
spinner.setOnItemSelectedListener(new AdapterView.OnItemSelectedListener() {
@Override
public void onItemSelected(AdapterView<?> parent, View view, int position, long id) {
languge_id = position;
System.out.println("QQQQQQQQQQQQQQQQQQQQQ"+languge_id);
switch (languge_id) {
// Mandarin
case 0:
txtResult.setText("点击‘开始’");
btn.setText("开始");
break;
// English
case 1:
//interfaceBaiduParams.setLanguage("English");
txtResult.setText("click 'start'");
btn.setText("start");
break;
// Sichuanese
case 2:
//interfaceBaiduParams.setLanguage("SiChuanHua");
txtResult.setText("点击‘开始’");
btn.setText("开始");
break;
// Cantonese
case 3:
//interfaceBaiduParams.setLanguage("YueYu");
txtResult.setText("点击‘开始’");
btn.setText("开始");
break;
}
}
@Override
public void onNothingSelected(AdapterView<?> parent) {
}
});
}
@Override
public void test(String str) {
}
private class MyServiceConnection implements ServiceConnection {
@Override
public void onServiceConnected(ComponentName componentName, IBinder iBinder) {
asRresultAIDL = ASRresultAIDL.Stub.asInterface(iBinder);
System.out.println("绑定服务成功了");
}
@Override
public void onServiceDisconnected(ComponentName componentName) {
System.out.println("解绑服务成功了");
}
}
}
(5) ASRresultAIDL.aidl: exposes the service's methods so that other apps can call them
// ASRresultAIDL.aidl
package aoto.com.baidurecongdemo;
// Declare any non-default types here with import statements
interface ASRresultAIDL {
String getASRResult();
void callStart();
void callStop();
}
(6)build.gradle:
apply plugin: 'com.android.application'
android {
compileSdkVersion 26
defaultConfig {
applicationId "aoto.com.baidurecongdemo"
minSdkVersion 21
targetSdkVersion 26
versionCode 1
versionName "1.0"
testInstrumentationRunner "android.support.test.runner.AndroidJUnitRunner"
// Package only the matching .so files
ndk{
abiFilters 'armeabi'
}
}
buildTypes {
release {
minifyEnabled false
proguardFiles getDefaultProguardFile('proguard-android.txt'), 'proguard-rules.pro'
}
}
}
dependencies {
implementation fileTree(include: ['*.jar'], dir: 'libs')
implementation 'com.android.support:appcompat-v7:26.1.0'
implementation 'com.android.support.constraint:constraint-layout:1.0.2'
testImplementation 'junit:junit:4.12'
androidTestImplementation 'com.android.support.test:runner:1.0.1'
androidTestImplementation 'com.android.support.test.espresso:espresso-core:3.0.1'
implementation files('libs/bdasr_V3_20171108_9800a2a.jar')
implementation files('libs/usc.jar')
}
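3. Notes:
(1) The Unisound offline TTS .so files are not built for x86, so they will not work on an x86 emulator under Windows, and an ARM emulator on Windows is very slow. I recommend debugging on a real device.
(2) The Baidu part only implements the simplest recognition call flow. Many parameters are unused; readers can extend it themselves.
(3) I only added Baidu's armeabi .so files, which is why the gradle file pins the ABI through ndk (with the NDK already installed). You can also leave that setting out.
(4) The main result is a repeater: whatever you say is recognized, and the recognized text is synthesized and played back. You can choose the voice yourself.
(5) I enabled four recognition languages for Baidu here: Mandarin, English, Sichuanese, and Cantonese.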
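The repeater logic from note (4) can also be written without touching Activity views from the service. The sketch below is plain Java under stated assumptions: `Echo` and its nested `Speaker` interface are hypothetical abstractions, not types from the Baidu or Unisound SDKs. In the real service, onPartial would be fed from the asr.partial event and onFinish from asr.finish, with the speaker delegating to mTTSPlayer.playText.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the echo logic: speak the last recognized text, or a fallback if there was none. */
final class Echo {
    /** Hypothetical abstraction of a TTS engine; not part of either SDK. */
    interface Speaker {
        void speak(String text);
    }

    private final Speaker speaker;
    private final String fallback;
    private String latest = "";

    Echo(Speaker speaker, String fallback) {
        this.speaker = speaker;
        this.fallback = fallback;
    }

    /** Call for each partial result; the most recent non-null text wins. */
    void onPartial(String text) {
        if (text != null) {
            latest = text;
        }
    }

    /** Call when recognition finishes; speaks the text, resets state, and returns what was spoken. */
    String onFinish() {
        String spoken = latest.isEmpty() ? fallback : latest;
        speaker.speak(spoken);
        latest = "";
        return spoken;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        Echo echo = new Echo(log::add, "No sound at all");
        echo.onPartial("hello");
        echo.onPartial("hello world");
        echo.onFinish();   // speaks the last partial result
        echo.onFinish();   // nothing recognized since, so the fallback is spoken
        System.out.println(log);
    }
}
```

Keeping the state in one small object also makes it easy to test the behaviour without a device or an SDK license.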