本文使用Kotlin开发Android平台的一个语音识别方面的应用,用的是欧拉密开放平台olamisdk。
1.Kotlin简介
Kotlin是由JetBrains创建的基于JVM的编程语言,IntelliJ正是JetBrains的杰作,而android Studio是
基于IntelliJ修改而来的。Kotlin是一门包含很多函数式编程思想的面向对象编程语言。
后来了解到Kotlin原来是以一个岛的名字命名的(Котлин),它是一门静态类型编程语言,支持JVM平台,Android平台,浏览器JS运行环境,本地机器码等。支持与Java,Android 100% 完全互操作。Kotlin生来就是为了弥补Java缺失的现代语言的特性,并极大的简化了代码,使得开发者可以编写尽量少的样板代码。
2.Kotlin,java,Swift简单比较
JAVA: System.out.println("Hello,World!");
Kotlin: println("Hello,World!")
Swift: print("Hello,World!")
Java: int mVariable =10;
mVariable =20;
static final int mConstant = 10;
Kotlin:var mVariable = 10
mVariable = 20
val mConstant = 10
Swift:var mVariable = 10
mVariable = 20
let mConstant = 10
感觉Swift和Kotlin比Java简洁,Kotlin和swift很像。
Swift :
let label = "Hello world "
let width = 80
let widthLabel = label + String(width)
Kotlin :
val label = "Hello world "
val width = 80
val widthLabel = label + width
Swift :
var tempList = ["one", "two","three"]
tempList[1] = "zero"
Kotlin :
val tempList = arrayOf("one", "two","three")
tempList[1] = "zero"
Swift : func greet(_ name: String,_ day: String) -> String {
return "Hello \(name),today is \(day)." }
greet("Bob", "Tuesday")
Kotlin :
fun greet(name: String, day: String): String {
return "Hello $name, today is $day."}
greet("Bob", "Tuesday")
Swift :
声明:class Shape {
var numberOfSides = 0
func simpleDescription() -> String {
return "A shape with \(numberOfSides) sides."
}
}
用法:var shape = Shape()
shape.numberOfSides = 7
var shapeDescription = shape.simpleDescription()
Kotlin :
声明:class Shape {
var numberOfSides = 0
fun simpleDescription() = "A shape with $numberOfSides sides."
}
用法: var shape = Shape()
shape.numberOfSides = 7
var shapeDescription = shape.simpleDescription()
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
可见,Kotlin和Swift好像,现代语言的特征,比java这样的高级语言更加简化,更贴近自然语言。
3.开发环境
本文使用的是android studio2.0版本,启动androd studio。
如下图在configure下拉菜单中选择plugins,在搜索框中搜索Kotlin,找到结果列表中的”Kotlin”插件。
如下图,找了一张还没有安装kotlin插件的图
点击右侧intall,安装后重启studio.
4.新建android项目
你可以像以前使用android stuio一样新建一个andoid项目,建立一个activity。本文用已经完成的一个demo来做示范。
如下图是一个stuio的demo工程
选择MainActivity和MessageConst两个java文件,然后选择导航栏上的code,在下拉菜单中选择convert java file to kotlin file
系统会自动进行转化,转化完后会生成对应的MainActivity.kt MessageConst.kt文件,打开MainActivity.kt,编译器上方会提示”Kotlin not configured”,点击一下Configure按钮,IDE就会自动帮我们配置好了!
将两个kt文件复制到src/kotlin目录下,如下图
转化后的文件,也许有些语法错误,需要按照kotlin的语法修改。
环境配置好后,来看下gradle更新有哪些区别
project的gradle代码如下:
buildscript {
ext.kotlin_version = '1.1.3-2'
repositories {
jcenter()
}
dependencies {
classpath 'com.android.tools.build:gradle:2.0.0'
//此处多了kotlin插件依赖
classpath "org.jetbrains.kotlin:kotlin-gradle-plugin:$kotlin_version"
}
}
allprojects {
repositories {
jcenter()
}
}
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
再来看看某个module的gradle代码:
apply plugin: 'com.android.application'
apply plugin: 'kotlin-android'
android {
compileSdkVersion 14
buildToolsVersion "24.0.0"
defaultConfig {
applicationId "com.olami"
minSdkVersion 8
targetSdkVersion 14
}
buildTypes {
release {
minifyEnabled false
proguardFiles getDefaultProguardFile('proguard-android.txt'), 'proguard-rules.txt'
}
}
sourceSets {
main.java.srcDirs += 'src/main/kotlin'
}
}
dependencies {
compile 'com.android.support:support-v4:18.0.0'
compile files('libs/voicesdk_android.jar')
compile "org.jetbrains.kotlin:kotlin-stdlib:$kotlin_version"
}
repositories {
mavenCentral()
}
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
- 31
- 32
- 33
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
- 31
- 32
- 33
如上所示,如果不是通过转化的方式新建kotlin工程,则需要自己按照上面的gradle中增加的部分配置好。
5.olami语音识别应用
先贴一张识别后的效果图:
在MainActivity.kt中
override fun onCreate(savedInstanceState: Bundle?) {
super.onCreate(savedInstanceState)
setContentView(R.layout.activity_main)
initHandler()
initView()
initViaVoiceRecognizerListener()
init()
}
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
fun init()
{
initHandler()
mOlamiVoiceRecognizer = OlamiVoiceRecognizer(this@MainActivity)
val telephonyManager = this.getSystemService(
Context.TELEPHONY_SERVICE) as TelephonyManager
val imei = telephonyManager.deviceId
mOlamiVoiceRecognizer!!.init(imei)
mOlamiVoiceRecognizer!!.setListener(mOlamiVoiceRecognizerListener)
mOlamiVoiceRecognizer!!.setLocalization(
OlamiVoiceRecognizer.LANGUAGE_SIMPLIFIED_CHINESE)
mOlamiVoiceRecognizer!!.setAuthorization("51a4bb56ba954655a4fc834bfdc46af1",
"asr", "68bff251789b426896e70e888f919a6d", "nli")
mOlamiVoiceRecognizer!!.setVADTailTimeout(2000)
mOlamiVoiceRecognizer!!.setLatitudeAndLongitude(
31.155364678184498, 121.34882432933009)
}
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
- 31
- 32
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
- 31
- 32
代码比较简单,点击开始录音button后,启动录音,在OlamiVoiceRecognizerListener中回调处理,然后通过handler发送消息用于更新界面。
来看一下初始化view的代码,看看跟java方式书写有哪些不同
private fun initView()
{
mBtnStart = findViewById(R.id.btn_start) as Button
mBtnStop = findViewById(R.id.btn_stop) as Button
mBtnCancel = findViewById(R.id.btn_cancel) as Button
mBtnSend = findViewById(R.id.btn_send) as Button
mInputTextView = findViewById(R.id.tv_inputText) as TextView
mEditText = findViewById(R.id.et_content) as EditText
mTextView = findViewById(R.id.tv_result) as TextView
mTextViewVolume = findViewById(R.id.tv_volume) as TextView
mBtnStart!!.setOnClickListener {
if (mOlamiVoiceRecognizer != null)
mOlamiVoiceRecognizer!!.start()
}
mBtnStop!!.setOnClickListener {
if (mOlamiVoiceRecognizer != null)
mOlamiVoiceRecognizer!!.stop()
mBtnStart!!.text = "开始"
Log.i("led", "MusicActivity mBtnStop onclick 开始")
}
mBtnCancel!!.setOnClickListener {
if (mOlamiVoiceRecognizer != null)
mOlamiVoiceRecognizer!!.cancel()
}
mBtnSend!!.setOnClickListener {
if (mOlamiVoiceRecognizer != null)
mOlamiVoiceRecognizer!!.sendText(mEditText!!.text.toString())
mInputTextView!!.text = "输入: " + mEditText!!.text
}
}
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
- 31
- 32
- 33
- 34
- 35
- 36
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
- 31
- 32
- 33
- 34
- 35
- 36
是不是感觉代码更简练了?
下面两句赋值,效果相同,第二句可以用id之间进行文本赋值,比以前简练好多。
mInputTextView!!.text = "输入: " + mEditText!!.text
tv_inputText.text = "输入: " + et_content.text
再来看看handler:
private fun initHandler() {
mHandler = object : Handler() {
override fun handleMessage(msg: Message) {
when (msg.what) {
MessageConst.CLIENT_ACTION_START_RECORED -> mBtnStart!!.text
= "录音中"
MessageConst.CLIENT_ACTION_STOP_RECORED -> mBtnStart!!.text
= "识别中"
MessageConst.CLIENT_ACTION_CANCEL_RECORED -> {
mBtnStart!!.text = "开始"
mTextView!!.text = "已取消"
}
MessageConst.CLIENT_ACTION_ON_ERROR -> {
mTextView!!.text = "错误代码:" + msg.arg1
mBtnStart!!.text = "开始"
}
MessageConst.CLIENT_ACTION_UPDATA_VOLUME -> mTextViewVolume!!.text
= "音量: " + msg.arg1
MessageConst.SERVER_ACTION_RETURN_RESULT -> {
if (msg.obj != null)
mTextView!!.text = "服务器返回: " + msg.obj.toString()
mBtnStart!!.text = "开始"
try {
val message = msg.obj as String
var input: String? = null
val jsonObject = JSONObject(message)
val jArrayNli =
jsonObject.optJSONObject("data").optJSONArray("nli")
val jObj = jArrayNli.optJSONObject(0)
var jArraySemantic: JSONArray? = null
if (message.contains("semantic")) {
jArraySemantic = jObj.getJSONArray("semantic")
input =
jArraySemantic!!.optJSONObject(0).optString("input")
} else {
input = jsonObject.optJSONObject("data")
.optJSONObject("asr").optString("result")
}
if (input != null)
mInputTextView!!.text = "输入: " + input
} catch (e: Exception) {
e.printStackTrace()
}
}
}
}
}
}
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
- 31
- 32
- 33
- 34
- 35
- 36
- 37
- 38
- 39
- 40
- 41
- 42
- 43
- 44
- 45
- 46
- 47
- 48
- 49
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
- 31
- 32
- 33
- 34
- 35
- 36
- 37
- 38
- 39
- 40
- 41
- 42
- 43
- 44
- 45
- 46
- 47
- 48
- 49
原来的switch case的方式,变成了when***,代码不仅简练,更贴近现代语言,更容易理解。
上面的MessageConst.SERVER_ACTION_RETURN_RESULT时,获取了服务器返回的结果,紧接着对这段语义进行了简单的解析
{
"data": {
"asr": {
"result": "我要听三国演义",
"speech_status": 0,
"final": true,
"status": 0
},
"nli": [
{
"desc_obj": {
"result": "正在努力搜索中,请稍等",
"status": 0
},
"semantic": [
{
"app": "musiccontrol",
"input": "我要听三国演义",
"slots": [
{
"name": "songname",
"value": "三国演义" }
],
"modifier": [
"play"
],
"customer": "58df512384ae11f0bb7b487e"
}
],
"type": "musiccontrol"
}
]
},
"status": "ok"
}
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
- 31
- 32
- 33
- 34
- 35
- 36
- 1
- 2
- 3
- 4
- 5
- 6
- 7
- 8
- 9
- 10
- 11
- 12
- 13
- 14
- 15
- 16
- 17
- 18
- 19
- 20
- 21
- 22
- 23
- 24
- 25
- 26
- 27
- 28
- 29
- 30
- 31
- 32
- 33
- 34
- 35
- 36
1)解析出nli中type类型是musiccontrol,这是语法返回app的类型,而这个在线听书的demo只关心musiccontrol这 个app类型,其他的忽略。
2)用户说的话转成文字是在asr中的result中获取
3)在nli中的semantic中,input值是用户说的话,同asr中的result。
modifier代表返回的行为动作,此处可以看到是play就是要求播放,slots中的数据表示歌曲名称是三国演义。
那么动作是play,内容是歌曲名称是三国演义,在这个demo中调用
mBookUtil.searchBookAndPlay(songName,0,0);会先查询,查询到结果会再发播放消息要求播放,我要听三国演义这个流程就走完了。
这段是在线听书应用中的语义解析,详情请看博客:http://blog.csdn.net/ls0609/article/details/71519203
6.代码下载
用Koltlin实现android平台语音识别语义理解
7.相关博客
语音在线听书博客:http://blog.csdn.net/ls0609/article/details/71519203
语音记账demo:http://blog.csdn.net/ls0609/article/details/72765789
基于javascript用olamisdk实现web端语音识别语义理解(speex压缩)
http://blog.csdn.net/ls0609/article/details/73920229
olami开放平台语法编写简介:http://blog.csdn.net/ls0609/article/details/71624340
olami开放平台语法官方介绍:https://cn.olami.ai/wiki/?mp=nli&content=nli2.html
转载请注明CSDN博文地址:http://blog.csdn.net/ls0609/article/details/75084994