Exploring the Microsoft Cognitive Services APIs: Speech Recognition

Official documentation: https://www.azure.cn/cognitive-services/en-us/speech-api/documentation/api-reference-rest/bingvoicerecognition

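Speech recognition requires a token, which must be obtained before any other call. A token is valid for 600 seconds. Once it has expired, calling the recognition endpoint with it returns a 403 error.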
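Because the token lasts only 600 seconds, a client usually caches it and requests a new one shortly before it expires. The following is a minimal sketch of that idea, not part of the official API: `TokenCache`, `fetch_token`, and the 30-second safety margin are hypothetical names and values chosen for illustration.

```python
import time
from typing import Callable, Optional


class TokenCache:
    """Caches an access token and refreshes it shortly before it expires."""

    def __init__(self, fetch_token: Callable[[], str],
                 lifetime_s: float = 600.0, margin_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch_token = fetch_token  # hypothetical callable that requests a new token
        self._lifetime_s = lifetime_s    # 600 seconds, as stated in the text above
        self._margin_s = margin_s        # assumed safety margin, not from the documentation
        self._clock = clock              # injectable clock, which makes the class testable
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        """Return the cached token, fetching a new one if it is missing or nearly expired."""
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin_s:
            self._token = self._fetch_token()
            self._expires_at = now + self._lifetime_s
        return self._token
```

Calling `cache.get()` before every recognition request keeps the 403 case from occurring in normal operation, although a caller may still want to retry once with a fresh token if the service rejects a request.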

Obtaining a Token

Method : POST

URL : https://oxford-speech.cloudapp.net/token/issueToken

Content-Type : application/x-www-form-urlencoded

Parameters (HTTP body):

grant_type = client_credentials

client_id = Your subscription key

client_secret = Your subscription key

scope = https://speech.platform.bing.com

Example:

grant_type=client_credentials&client_id=&client_secret=&scope=https://speech.platform.bing.com

Response:

Content-Type: application/json; charset=utf-8

{

"access_token":,

"token_type":"jwt",

"expires_in":"600",

"scope":"https://speech.platform.bing.com"

}
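The token request above can be sketched with the Python standard library. This is an illustrative sketch under the parameters listed in this section: the function names are hypothetical, and the subscription key is a placeholder the caller must supply. Note that `urlencode` percent-encodes the `scope` URL, which is the standard form encoding.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://oxford-speech.cloudapp.net/token/issueToken"
SCOPE = "https://speech.platform.bing.com"


def build_token_body(subscription_key: str) -> bytes:
    """Form-encode the four parameters listed above."""
    fields = {
        "grant_type": "client_credentials",
        "client_id": subscription_key,      # the subscription key is used for both fields
        "client_secret": subscription_key,
        "scope": SCOPE,
    }
    return urllib.parse.urlencode(fields).encode("ascii")


def parse_token_response(raw: bytes):
    """Return (access_token, expires_in_seconds) from the JSON response."""
    data = json.loads(raw.decode("utf-8"))
    # expires_in arrives as a string ("600") in the sample response.
    return data["access_token"], int(data["expires_in"])


def request_token(subscription_key: str, timeout_s: float = 10.0):
    """POST the form body to the token endpoint (performs a network call)."""
    req = urllib.request.Request(
        TOKEN_URL,
        data=build_token_body(subscription_key),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return parse_token_response(resp.read())
```

Splitting the body construction and the response parsing from the network call keeps both pieces easy to check without contacting the service.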

Recognizing Speech as Text

Method : POST

BaseURL : https://speech.platform.bing.com/recognize

HTTP query parameters:

Required parameters:

version=3.0

requestid=b2c95ede-97eb-4c88-81e4-80f32d6aee53 (request ID chosen by the caller)

appID=D4D52672-91D7-4C74-8AD8-42B1D98141A3 (AppId chosen by the caller)

format=json

locale=zh-CN (language code as defined by IETF RFC 5646)

device.os=iOS7 (Windows OS, Windows Phone OS, XBOX, Android, iPhone OS. Example: device.os=Windows Phone OS)

scenarios=ulm (recognition mode: ulm or websearch)

instanceid=b2c95ede-97eb-4c88-81e4-80f32d6aee53 (instance ID chosen by the caller)

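Optional parameters:

maxnbest=3 (maximum number of results returned for a single request; defaults to 1)

result.profanity=1 (checks the recognized text against a list of offensive words, filters them, and marks the ones that were recognized; 0 disables the feature, 1 enables it; defaults to 1)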

Example 1:

version=3.0&requestid=b2c95ede-97eb-4c88-81e4-80f32d6aee53&appID=D4D52672-91D7-4C74-8AD8-42B1D98141A3&format=json&locale=zh-CN&device.os=iOS7&scenarios=ulm&instanceid=b2c95ede-97eb-4c88-81e4-80f32d6aee53

Example 2:

scenarios=catsearch&appid=f84e364c-ec34-4773-a783-73707bd9a583&locale=en-US&device.os=wp7&version=3.0&format=xml&requestid=1d4b6030-9099-11e0-91e4-0800200c9a63&instanceid=1d4b6030-9099-11e0-91e4-0800200c9a63
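As a sketch of how the query string could be assembled from the required parameters, the helper below builds the full request URL. `build_recognize_url` is a hypothetical name, and generating `requestid` with `uuid.uuid4()` when none is given is an assumption based on the UUID-shaped IDs in the examples.

```python
import urllib.parse
import uuid

RECOGNIZE_URL = "https://speech.platform.bing.com/recognize"


def build_recognize_url(app_id: str, instance_id: str, locale: str = "zh-CN",
                        device_os: str = "iOS7", scenarios: str = "ulm",
                        request_id: str = "") -> str:
    """Build the recognition URL with the required query parameters."""
    params = {
        "version": "3.0",
        # Assumption: a fresh random UUID is acceptable as a request ID.
        "requestid": request_id or str(uuid.uuid4()),
        "appID": app_id,
        "format": "json",
        "locale": locale,
        "device.os": device_os,
        "scenarios": scenarios,
        "instanceid": instance_id,
    }
    return RECOGNIZE_URL + "?" + urllib.parse.urlencode(params)
```

Passing the IDs from Example 1 reproduces that query string exactly, since none of its values need percent-encoding.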

HTTPHeaderFields:

Content-Type : audio/wav; samplerate=8000

Authorization : Bearer

HTTPBody: the raw audio data, in the format declared by Content-Type

Response (simplified):

{

header: { name: string }

}

The value of name is the recognized text.
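Putting the headers and the response together, a recognition call might look like the sketch below. It rests on two assumptions that the notes above leave implicit: the access token follows `Bearer ` in the Authorization header, and the request body is the raw bytes of the WAV file. The function names are hypothetical.

```python
import json
import urllib.request


def build_recognize_request(url: str, access_token: str, wav_bytes: bytes,
                            sample_rate: int = 8000) -> urllib.request.Request:
    """Prepare the POST request carrying the audio data."""
    return urllib.request.Request(
        url,
        data=wav_bytes,  # assumption: the body is the raw WAV data
        headers={
            "Content-Type": "audio/wav; samplerate=%d" % sample_rate,
            # Assumption: the token obtained earlier follows "Bearer ".
            "Authorization": "Bearer " + access_token,
        },
        method="POST",
    )


def extract_text(raw: bytes) -> str:
    """Pull the recognized text out of the simplified JSON response."""
    return json.loads(raw.decode("utf-8"))["header"]["name"]


def recognize(url: str, access_token: str, wav_bytes: bytes,
              timeout_s: float = 30.0) -> str:
    """Send the audio and return the recognized text (performs a network call)."""
    req = build_recognize_request(url, access_token, wav_bytes)
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        return extract_text(resp.read())
```

`extract_text` only works when the request uses format=json; with format=xml, as in Example 2, the response would need an XML parser instead.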
