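I recently looked into AIUI, iFlytek's intelligent voice service, and built a fairly simple demo of my own on top of the official one (Android -> AIUI -> server-side post-processing) to try out its features.
The official Android demo offers speech dictation, grammar recognition, semantic understanding, speech synthesis, voiceprint passwords, and more; I mainly used semantic understanding and speech synthesis.
An earlier post covered Amazon's Alexa, and AIUI is presumably aiming at something similar. For now, though, AIUI's features are limited and its developer documentation is mediocre: it spends most of its length on web-console operations and is vague where it matters. For "custom skills" and "custom entities", the features developers care about most, the docs only briefly show how to add one, and they never say what a "custom entity" actually is. The official demos are also very basic. Material on "post-processing", which third-party app developers need, is especially scarce: the only official example is about as minimal as it gets. It shows AIUI forwarding the semantic analysis result to a third-party post-processing server and stops once the server has received it. I was honestly speechless at that point... How to handle the data once it arrives, what format to return, and how the returned data gets used: not a word, o(╯□╰)o.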
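Where AIUI currently fits
Voice guidance: general-purpose features such as "weather" among the open skills, where a query does not need to be tied to user information and contains a predictable keyword such as "weather" (the so-called "custom entity").
What AIUI cannot handle
Cases where the corpus cannot be set up in advance, such as account-related input or entering information by voice (for example, a medical consultation app that records the user's name).
What I wanted to build before the investigation
- Voice guidance: as with an ATM-style self-service terminal, the user interacts with the terminal by voice. For example, the terminal asks "Hello, how can I help?", the user answers "take a photo", and the terminal enters its photo function;
AIUI can currently meet this need; how to implement it is covered later in this post.
- Voice data entry: filling in information from the user's speech, such as name, age, and medical conditions;
AIUI is not currently suitable for this: too many names sound the same but are written differently. For example, "wangxiaoer" could be "王小二" or "王晓二", so entering information by voice costs more than it saves.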
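Custom skills
When the open skills do not include what you need, you implement a custom skill yourself: semantic understanding extracts the keywords, and the "post-processing" server performs actions based on them.
For example, set the utterance "我尚" in a custom skill. When the user says "我尚", AIUI triggers that custom skill with "我尚" as the query. Once the post-processing service receives this semantic result, it can perform some actions and return data: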
{
"category": "ISHANG.mHealth_demo:11.0",
"intentType": "custom",
"query": "我尚",
"query_ws": "我/NP// 尚/ADD//",
"rc": 0,
"nlis": "true",
"service": "ISHANG.mHealth_demo",
"uuid": "atn000167dc@ch60f10d1d86686f2601",
"vendor": "ISHANG",
"version": "11.0",
"semantic": [
{
"intent": "init",
"score": 1,
"slots": []
}
],
"sid": "atn000167dc@ch60f10d1d86686f2601",
"text": "我尚"
}
Note the rc field here: 0 means semantic understanding succeeded. When it fails, the result looks like this (rc is 4):
{
"rc": 4,
"uuid": "atn00016bf3@ch60f10d1d86f86f2601",
"sid": "atn00016bf3@ch60f10d1d86f86f2601",
"text": "大王"
}
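A post-processing service will probably want to branch on rc before doing anything else. Here is a minimal sketch in Python (the demo server itself is Node.js); the function name `is_understood` and the constant names are my own, and only the values 0 and 4 come from the results shown above.

```python
import json

RC_SUCCESS = 0         # seen above: semantic understanding succeeded
RC_NOT_UNDERSTOOD = 4  # seen above: AIUI could not understand the utterance


def is_understood(result_json: str) -> bool:
    """Return True when the AIUI result carries rc == 0."""
    result = json.loads(result_json)
    return result.get("rc") == RC_SUCCESS


# Trimmed copies of the two results shown above.
ok = '{"rc": 0, "intentType": "custom", "text": "我尚", "semantic": [{"intent": "init", "score": 1, "slots": []}]}'
failed = '{"rc": 4, "text": "大王"}'

print(is_understood(ok))      # True
print(is_understood(failed))  # False
```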
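Custom entities
If a custom skill uses an utterance such as "我是{name}" ("I am {name}"), then {name} is a custom entity (think of it as a corpus). The open entities include provinces, cities, song titles, and so on; if none fits, define your own and add an entry such as "张三". During semantic understanding, the matched value shows up in the semantic slots: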
{
"category": "ISHANG.mHealth_demo:11.0",
"intentType": "custom",
"query": "我是张三",
"query_ws": "我/NP// 是/V_SHI// 张三/NPP//",
"rc": 0,
"nlis": "true",
"service": "ISHANG.mHealth_demo",
"uuid": "atn00016f72@ch0de90d1d87956f2a01",
"vendor": "ISHANG",
"version": "11.0",
"semantic": [
{
"intent": "input_name",
"score": 1,
"slots": [
{
"begin": 2,
"end": 4,
"name": "name",
"normValue": "张三",
"value": "张三"
}
]
}
],
"sid": "atn00016f72@ch0de90d1d87956f2a01",
"text": "我是张三"
}
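Note that "张三" had already been added to the custom entity, so it appears in the JSON under semantic[0].slots[0]. This is the essence of semantic understanding: you add entries to the corpus ahead of time, and in the semantic result the matched entry is pulled out on its own so that post-processing can use it as a business-logic parameter.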
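To make the "pull the slot out and use it as a parameter" step concrete, here is a small Python sketch. The helper name `extract_intent_and_slots` is hypothetical; the field names (rc, semantic, intent, slots, name, value, normValue) are the ones visible in the JSON above. It falls back to value when a slot has no normValue.

```python
from typing import Dict, Optional, Tuple


def extract_intent_and_slots(result: dict) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (intent, {slot name: slot value}) or None if nothing was understood."""
    if result.get("rc") != 0:
        return None
    semantic = result.get("semantic") or []
    if not semantic:
        return None
    first = semantic[0]
    slots = {
        slot["name"]: slot.get("normValue", slot.get("value"))
        for slot in first.get("slots", [])
    }
    return first.get("intent"), slots


# Trimmed copy of the "我是张三" result shown above.
sample = {
    "rc": 0,
    "text": "我是张三",
    "semantic": [
        {
            "intent": "input_name",
            "score": 1,
            "slots": [
                {"begin": 2, "end": 4, "name": "name", "normValue": "张三", "value": "张三"}
            ],
        }
    ],
}

print(extract_intent_and_slots(sample))  # ('input_name', {'name': '张三'})
```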
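However, the corpus must be entered in advance, otherwise semantic understanding fails. So what about cases such as entering a name, where understanding fails unless you turn the name of every person in China into a corpus?
For an entity that has not been added, the result is as follows; rc is 4, meaning the semantics could not be understood: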
{
"rc": 4,
"uuid": "atn000176af@ch0de90d1d88736f2a01",
"sid": "atn000176af@ch0de90d1d88736f2a01",
"text": "我是例子"
}
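Post-processing
If post-processing is configured, the AIUI server forwards the semantic understanding result to the post-processing server. In the function (or method) on that server that receives AIUI's forwarded request via POST, we can extract the semantic result, run queries or other operations, and then return a response.
POST request data
The key data is in the Msg.Content field. Note that SessionParams and Msg.Content are Base64-encoded and must be decoded before use. After decoding, the complete request data looks like this: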
{"MsgId":"cid6f1c2494@ch00270d1c09d20100101","CreateTime":1505803732,"AppId":"59bf6334","UserId":"d3146084944","SessionParams":{"dsrc":"sdk","dts":"1","dtype":"audio","msc.lat":"39.895252","msc.lng":"116.343834","scene":"main","scity":"ch","sid":"cid6f1c2494@ch00270d1c09d2010010","stmid":"audio-16","ver_type":"mobile_phone","wake_id":"15058037304161d1c41c87ab7cd3c"},"UserParams":"","FromSub":"kc","Msg":{"ContentType":"json","Type":"text","Content":{"intent":{"data":{"result":[{"airData":40,"airQuality":"优","city":"北京","date":"2017-09-19","dateLong":1505750400,"exp":{"ct":{"expName":"穿衣指数","level":"热","prompt":"天气热,建议着短裙、短裤、短薄外套、T恤等夏季服装。"}},"humidity":"20%","lastUpdateTime":"2017-09-19 11:39:20","pm25":"13","temp":29,"tempRange":"14℃~30℃","weather":"晴","weatherType":0,"wind":"西北风3-4级","windLevel":1},{"city":"北京","date":"2017-09-20","dateLong":1505836800,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"14℃~27℃","weather":"晴","weatherType":0,"wind":"南风微风","windLevel":0},{"city":"北京","date":"2017-09-21","dateLong":1505923200,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"17℃~28℃","weather":"多云","weatherType":1,"wind":"南风微风","windLevel":0},{"city":"北京","date":"2017-09-22","dateLong":1506009600,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"15℃~28℃","weather":"晴","weatherType":0,"wind":"西北风微风","windLevel":0},{"city":"北京","date":"2017-09-23","dateLong":1506096000,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"18℃~29℃","weather":"晴转多云","weatherType":0,"wind":"南风微风","windLevel":0},{"city":"北京","date":"2017-09-24","dateLong":1506182400,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"19℃~28℃","weather":"阴","weatherType":2,"wind":"东风微风","windLevel":0},{"city":"北京","date":"2017-09-25","dateLong":1506268800,"lastUpdateTime":"2017-09-19 11:39:20","tempRange":"19℃~28℃","weather":"多云转阴","weatherType":1,"wind":"东南风微风","windLevel":0}]},"rc":0,"semantic":[{"intent":"QUERY","slots":[{"name":"location.city","value":"CURRENT_CITY","normValue":"CURRENT_CITY"},{"name":"location.poi","value":"CURRENT_POI","normValue":"CURRENT_POI"},{"name":"location.type","value":"LOC_POI","normValue":"LOC_POI"},{"name":"queryType","value":"内容"},{"name":"subfocus","value":"天气状态"}]}],"service":"weather","text":"天气","uuid":"atn00913a37@ch46b50d1c09d46f2a01","used_state":{"state_key":"fg::weather::default::default","state":"default"},"answer":{"text":"\"北京\"今天\"晴\",\"14℃~30℃\",\"西北风3-4级\""},"dialog_stat":"DataValid","sid":"cid6f1c2494@ch00270d1c09d2010010"}}}}
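The decoding step can be sketched roughly as follows in Python. One assumption to flag loudly: I am treating SessionParams and Msg.Content as Base64 strings that each wrap UTF-8 JSON text, which is what the decoded request above suggests but is not something the official docs spell out. `decode_field`, `decode_request`, and the `_encode` helper used to fabricate a sample request are all hypothetical names.

```python
import base64
import copy
import json


def decode_field(encoded: str) -> dict:
    """Base64-decode a field and parse it as JSON (assumed wire format)."""
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


def decode_request(raw: dict) -> dict:
    """Return a copy of the request with the two encoded fields decoded."""
    decoded = copy.deepcopy(raw)
    decoded["SessionParams"] = decode_field(raw["SessionParams"])
    decoded["Msg"]["Content"] = decode_field(raw["Msg"]["Content"])
    return decoded


def _encode(obj: dict) -> str:
    """Test helper only: build an encoded field the way we assume AIUI does."""
    text = json.dumps(obj, ensure_ascii=False)
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


# A made-up, heavily trimmed request just to exercise the functions.
sample_raw = {
    "MsgId": "demo-msg-id",
    "SessionParams": _encode({"scene": "main"}),
    "Msg": {
        "ContentType": "json",
        "Type": "text",
        "Content": _encode({"intent": {"rc": 0, "service": "weather", "text": "天气"}}),
    },
}

request = decode_request(sample_raw)
print(request["Msg"]["Content"]["intent"]["text"])  # 天气
```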
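Response format
Follow the JSON returned by open skills such as "weather": put the result in data or in the intent's answer, and keep the other fields as they arrived in the POST request. The data after semantic understanding of "天气" (weather) is as follows: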
{
"data": {
"result": [
{
"airData": 44,
"airQuality": "优",
"city": "北京",
"date": "2017-09-20",
"dateLong": 1505836800,
"exp": {
"ct": {
"expName": "穿衣指数",
"level": "热",
"prompt": "天气热,建议着短裙、短裤、短薄外套、T恤等夏季服装。"
}
},
"humidity": "25%",
"lastUpdateTime": "2017-09-20 11:07:03",
"pm25": "10",
"temp": 24,
"tempRange": "14℃~27℃",
"weather": "晴",
"weatherType": 0,
"wind": "北风微风",
"windLevel": 0
},
{
"city": "北京",
"date": "2017-09-21",
"dateLong": 1505923200,
"lastUpdateTime": "2017-09-20 11:07:03",
"tempRange": "17℃~29℃",
"weather": "多云",
"weatherType": 1,
"wind": "南风微风",
"windLevel": 0
},
{
"city": "北京",
"date": "2017-09-22",
"dateLong": 1506009600,
"lastUpdateTime": "2017-09-20 11:07:03",
"tempRange": "13℃~27℃",
"weather": "晴",
"weatherType": 0,
"wind": "西北风微风",
"windLevel": 0
},
{
"city": "北京",
"date": "2017-09-23",
"dateLong": 1506096000,
"lastUpdateTime": "2017-09-20 11:07:03",
"tempRange": "18℃~29℃",
"weather": "晴转多云",
"weatherType": 0,
"wind": "南风微风",
"windLevel": 0
},
{
"city": "北京",
"date": "2017-09-24",
"dateLong": 1506182400,
"lastUpdateTime": "2017-09-20 11:07:03",
"tempRange": "19℃~28℃",
"weather": "阴",
"weatherType": 2,
"wind": "东风微风",
"windLevel": 0
},
{
"city": "北京",
"date": "2017-09-25",
"dateLong": 1506268800,
"lastUpdateTime": "2017-09-20 11:07:03",
"tempRange": "19℃~28℃",
"weather": "多云转阴",
"weatherType": 1,
"wind": "东南风微风",
"windLevel": 0
},
{
"city": "北京",
"date": "2017-09-26",
"dateLong": 1506355200,
"lastUpdateTime": "2017-09-20 11:07:03",
"tempRange": "13℃~25℃",
"weather": "晴",
"weatherType": 0,
"wind": "西北风3-4级",
"windLevel": 1
}
]
},
"rc": 0,
"semantic": [
{
"intent": "QUERY",
"slots": [
{
"name": "location.city",
"value": "CURRENT_CITY",
"normValue": "CURRENT_CITY"
},
{
"name": "location.poi",
"value": "CURRENT_POI",
"normValue": "CURRENT_POI"
},
{
"name": "location.type",
"value": "LOC_POI",
"normValue": "LOC_POI"
},
{
"name": "queryType",
"value": "内容"
},
{
"name": "subfocus",
"value": "天气状态"
}
]
}
],
"service": "weather",
"text": "天气",
"uuid": "atn00018593@ch60f10d1d8a556f2601",
"used_state": {
"state_key": "fg::weather::default::default",
"state": "default"
},
"answer": {
"text": "\"北京\"今天\"晴\",\"14℃~27℃\",\"北风微风\""
},
"dialog_stat": "DataValid",
"sid": "atn00018593@ch60f10d1d8a556f2601"
}
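Following that rule of thumb, building a reply could look something like the Python sketch below: copy the incoming intent object, then overwrite answer (and optionally data). The function name `build_reply` is my own, and how the reply must be wrapped or encoded on the way back to AIUI is not covered here because the docs do not say.

```python
import copy
from typing import Optional


def build_reply(intent: dict, answer_text: str, data: Optional[dict] = None) -> dict:
    """Copy the incoming intent and attach our result, leaving other fields untouched."""
    reply = copy.deepcopy(intent)
    reply["answer"] = {"text": answer_text}
    if data is not None:
        reply["data"] = data
    return reply


# Trimmed copy of the custom-skill result for "我尚" shown earlier.
incoming = {
    "rc": 0,
    "service": "ISHANG.mHealth_demo",
    "text": "我尚",
    "semantic": [{"intent": "init", "score": 1, "slots": []}],
}

reply = build_reply(incoming, "欢迎您使用...请说出自己的名字")
print(reply["answer"]["text"])
print(reply["service"])  # ISHANG.mHealth_demo
```

The deep copy keeps the original request intact, which is handy when you want to log what AIUI sent alongside what you returned.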
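DEMO
Demo address; the demo includes the post-processing server and the Android app
Post-processing server
Implemented in Node.js
- The GET request handler mainly implements AIUI's post-processing server verification
- The POST request handler implements a very simple state machine that combines the semantic result sent by AIUI with a code variable to decide what data to return; the response format follows the open skill "weather"
Android App
Built on the official demo (after you register an Android application on AIUI, the project code can be downloaded from that application's settings page; sadly, it turns out to be an Eclipse project o(╯□╰)o)
- Added spoken playback of results to the semantic understanding demo, using the speech synthesis feature
- Say "我尚" and it replies "欢迎您使用...请说出自己的名字" ("Welcome to ... please say your name"); then say "张三" and it replies "您的名字是张三,请说出您的年龄" ("Your name is 张三, please say your age"); then say "28" and it replies "您的年龄是28,谢谢使用,再见" ("Your age is 28, thank you for using this, goodbye").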
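The three-turn dialog above can be approximated with a tiny state machine. This is a Python sketch, not the demo's Node.js code: the `Step` enum stands in for the demo's code variable, the class and method names are hypothetical, and for simplicity it keys off the recognized text only rather than the full semantic result. What the real demo does with an unexpected utterance is not described, so the sketch just returns None.

```python
from enum import Enum
from typing import Optional


class Step(Enum):
    IDLE = 0       # waiting for the wake phrase "我尚"
    WAIT_NAME = 1  # asked for the name
    WAIT_AGE = 2   # asked for the age


class Dialog:
    def __init__(self) -> None:
        self.step = Step.IDLE

    def handle(self, text: str) -> Optional[str]:
        """Advance the dialog by one user utterance and return the reply text."""
        if self.step is Step.IDLE:
            if text != "我尚":
                return None  # behavior for other input is not specified in this post
            self.step = Step.WAIT_NAME
            return "欢迎您使用...请说出自己的名字"
        if self.step is Step.WAIT_NAME:
            self.step = Step.WAIT_AGE
            return f"您的名字是{text},请说出您的年龄"
        # Step.WAIT_AGE: finish and go back to the start.
        self.step = Step.IDLE
        return f"您的年龄是{text},谢谢使用,再见"


dialog = Dialog()
for utterance in ["我尚", "张三", "28"]:
    print(dialog.handle(utterance))
```

A real server would need one such state per user or session (the request carries UserId and sid fields that could serve as a key), since a single shared variable would mix up concurrent users.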