Speech Recognition on iOS

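Preface: Ever since I watched Luo Yonghao demo speech recognition at the Smartisan phone launch (the phone integrates iFlytek's speech technology), I've thought it looked seriously impressive, and that got me interested in speech recognition. Recently, while browsing websites and tech blogs, I came across an article introducing Speech, Apple's built-in speech recognition framework (available since iOS 10). I then read through the framework's developer documentation carefully, and that is what prompted this post.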

1. Components of the Speech framework

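SFSpeechRecognizer.h: the speech recognizer

SFSpeechRecognitionRequest.h: a speech recognition request

SFSpeechRecognitionTask.h: a speech recognition task

SFSpeechRecognitionTaskHint.h: a hint describing the type of recognition task

SFSpeechRecognitionResult.h: the result of speech recognition

SFTranscriptionSegment.h: a segment (substring) of a transcription

SFTranscription.h: the text form of the recognized speech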
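To make the relationship between the result classes concrete, here is a minimal sketch of how a result, its best transcription, and the transcription's segments fit together. The helper name LogTranscriptionSegments is hypothetical and not part of the framework; the properties it reads (bestTranscription, formattedString, segments, substring, timestamp, duration, confidence) are the ones declared in the headers listed above.

```objective-c
#import <Speech/Speech.h>

// Hypothetical helper (not from the original article): logs the best
// transcription of a result, then each of its segments.
static void LogTranscriptionSegments(SFSpeechRecognitionResult *result) {
    SFTranscription *best = result.bestTranscription;
    NSLog(@"Full text: %@", best.formattedString);

    for (SFTranscriptionSegment *segment in best.segments) {
        // timestamp and duration are in seconds, relative to the start of the audio
        NSLog(@"'%@' starts at %.2fs, lasts %.2fs, confidence %.2f",
              segment.substring,
              segment.timestamp,
              segment.duration,
              segment.confidence);
    }
}
```

A function like this could be called from a recognition task's result handler whenever result is non-nil.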

2. Authorization and setup before recognition

Declare the following properties:

// UI outlets
@property (weak, nonatomic) IBOutlet UITextView *wordTextView;
@property (weak, nonatomic) IBOutlet UIButton *recordBtn;

// Recognition-related objects
@property (nonatomic, strong) SFSpeechRecognizer *recognizer;
@property (nonatomic, strong) SFSpeechAudioBufferRecognitionRequest *recognitionRequest;
@property (nonatomic, strong) SFSpeechRecognitionTask *recognitionTask;
@property (nonatomic, strong) AVAudioEngine *engine;
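The article never shows the imports these properties need, or where the audio engine is created. The following is a minimal sketch of one way to fill that gap, assuming a view controller named ViewController (the article does not name its class) and a lazily created engine; treat it as an illustration rather than the author's actual setup.

```objective-c
#import <UIKit/UIKit.h>
#import <Speech/Speech.h>
#import <AVFoundation/AVFoundation.h>

// "ViewController" is an assumed name; the article does not name its controller.
@interface ViewController : UIViewController
@end

// The class extension adopts SFSpeechRecognizerDelegate so the controller can
// be assigned as the recognizer's delegate.
@interface ViewController () <SFSpeechRecognizerDelegate>
@property (nonatomic, strong) AVAudioEngine *engine;
@end

@implementation ViewController

// Lazily create the audio engine the first time it is accessed, so that
// self.engine is never nil when recording starts.
- (AVAudioEngine *)engine {
    if (!_engine) {
        _engine = [[AVAudioEngine alloc] init];
    }
    return _engine;
}

@end
```

Creating the engine in viewDidLoad would work just as well; the lazy getter simply keeps the setup next to the property it serves.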

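// The set of locales Apple supports for speech recognition (62 languages at the time of writing)
[SFSpeechRecognizer supportedLocales];

// Locale object for Simplified Chinese (mainland China)
NSLocale *locale = [[NSLocale alloc] initWithLocaleIdentifier:@"zh-CN"];

// Initialize the recognizer with the locale; a recognizer handles only one locale
self.recognizer = [[SFSpeechRecognizer alloc] initWithLocale:locale];

// Set the delegate
self.recognizer.delegate = self;

// SFSpeechRecognizerDelegate method, called when the recognizer's availability changes
- (void)speechRecognizer:(SFSpeechRecognizer *)speechRecognizer availabilityDidChange:(BOOL)available {
    // Enable or disable the button that controls speech recognition
    self.recordBtn.enabled = available;
}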
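Because initWithLocale: returns nil for an unsupported locale, and a recognizer can also be temporarily unavailable, it may be worth checking both before enabling the UI. The helper below is a hedged sketch; the function name MakeRecognizer is made up for illustration.

```objective-c
#import <Speech/Speech.h>

// Hypothetical helper (not from the original article): creates a recognizer
// for the given locale identifier, or returns nil if the locale is unsupported.
static SFSpeechRecognizer *MakeRecognizer(NSString *localeIdentifier) {
    NSLocale *locale = [NSLocale localeWithLocaleIdentifier:localeIdentifier];
    SFSpeechRecognizer *recognizer = [[SFSpeechRecognizer alloc] initWithLocale:locale];

    if (recognizer == nil) {
        NSLog(@"Speech recognition does not support locale %@", localeIdentifier);
        return nil;
    }
    if (!recognizer.isAvailable) {
        // The recognizer exists but cannot be used right now;
        // the availabilityDidChange delegate callback reports when that changes.
        NSLog(@"Recognizer for %@ is currently unavailable", localeIdentifier);
    }
    return recognizer;
}
```

For example, self.recognizer = MakeRecognizer(@"zh-CN"); could replace the direct alloc/init call.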

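// Request the user's permission to use speech recognition

// Note: Info.plist must contain the Privacy - Speech Recognition Usage Description key (NSSpeechRecognitionUsageDescription)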

[SFSpeechRecognizer requestAuthorization:^(SFSpeechRecognizerAuthorizationStatus status) {
    // Whether the record button should be tappable
    BOOL isButtonEnabled = NO;

    switch (status) {
        case SFSpeechRecognizerAuthorizationStatusDenied:
            isButtonEnabled = NO;
            NSLog(@"The user denied access to speech recognition");
            break;
        case SFSpeechRecognizerAuthorizationStatusAuthorized:
            isButtonEnabled = YES;
            NSLog(@"Speech recognition is authorized");
            break;
        case SFSpeechRecognizerAuthorizationStatusRestricted:
            isButtonEnabled = NO;
            NSLog(@"Speech recognition is restricted on this device");
            break;
        case SFSpeechRecognizerAuthorizationStatusNotDetermined:
            isButtonEnabled = NO;
            NSLog(@"Speech recognition has not been authorized yet");
            break;
        default:
            break;
    }

    // Note: this handler may be called on a background thread
    NSLog(@"%@", [NSThread currentThread]);

    dispatch_async(dispatch_get_main_queue(), ^{
        // Back on the main thread, update the UI: enable or disable the button
        self.recordBtn.enabled = isButtonEnabled;
    });
}];
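The two privacy entries this article relies on can also be added by editing Info.plist as source. The fragment below is a sketch: the key names are the raw keys behind the "Privacy - ..." labels shown in Xcode, while the description strings are placeholders to replace with your own wording.

```xml
<!-- Placed inside the top-level <dict> of Info.plist -->
<key>NSSpeechRecognitionUsageDescription</key>
<string>Your speech is sent for recognition so it can be converted to text.</string>
<key>NSMicrophoneUsageDescription</key>
<string>The microphone is used to record the speech you want transcribed.</string>
```

Without these entries the app is terminated by the system when it first requests the corresponding permission.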

3. Speech recognition

// Start recording and transcribe speech to text
- (void)startRecording {
    // Cancel any recognition task that is still in progress
    if (self.recognitionTask) {
        [self.recognitionTask cancel];
        self.recognitionTask = nil;
    }

    // Configure the audio session and check that audio input is usable
    AVAudioSession *audioSession = [AVAudioSession sharedInstance];

    // Note: Info.plist must contain the Privacy - Microphone Usage Description key (NSMicrophoneUsageDescription)
    BOOL audioBool = [audioSession setCategory:AVAudioSessionCategoryRecord error:nil];

    // Categories:
    //   AVAudioSessionCategoryRecord         for recording audio
    //   AVAudioSessionCategoryAmbient        for background sounds such as rain or car engine noise
    //   AVAudioSessionCategorySoloAmbient    for background sounds; other music stops playing
    //   AVAudioSessionCategoryPlayback       for playing music tracks
    //   AVAudioSessionCategoryPlayAndRecord  for recording and playing back audio

    BOOL audioBool1 = [audioSession setMode:AVAudioSessionModeMeasurement error:nil];

    // Mode:
    //   AVAudioSessionModeMeasurement  for apps that want to minimize the system-supplied
    //                                  signal processing applied to input and/or output audio

    // Activate the audio session
    BOOL audioBool2 = [audioSession setActive:YES withOptions:AVAudioSessionSetActiveOptionNotifyOthersOnDeactivation error:nil];

    // All three calls must succeed for the session to be usable
    if (audioBool && audioBool1 && audioBool2) {
        NSLog(@"The audio session is ready");
    } else {
        NSLog(@"Some audio session settings are not supported");
    }

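    // Create the recognition request: a request to recognize speech from arbitrary audio buffers
    self.recognitionRequest = [[SFSpeechAudioBufferRecognitionRequest alloc] init];

    AVAudioInputNode *inputNode = self.engine.inputNode;

    // Report partial (non-final) results while the user is still speaking
    self.recognitionRequest.shouldReportPartialResults = YES;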

    // Start the recognition task
    self.recognitionTask = [self.recognizer recognitionTaskWithRequest:self.recognitionRequest resultHandler:^(SFSpeechRecognitionResult * _Nullable result, NSError * _Nullable error) {
        BOOL isFinal = NO;

        if (result) {
            // Show the transcribed text
            self.wordTextView.text = [[result bestTranscription] formattedString];
            isFinal = [result isFinal];
        }

        if (error || isFinal) {
            // Recognition failed or has finished: stop the engine and clean up
            [self.engine stop];
            // Remove the tap from the input node
            [inputNode removeTapOnBus:0];
            self.recognitionRequest = nil;
            self.recognitionTask = nil;
            self.recordBtn.enabled = YES;
        }
    }];

    AVAudioFormat *recordingFormat = [inputNode outputFormatForBus:0];

    // Install a tap on the input node and feed each captured buffer to the request
    [inputNode installTapOnBus:0 bufferSize:1024 format:recordingFormat block:^(AVAudioPCMBuffer * _Nonnull buffer, AVAudioTime * _Nonnull when) {
        [self.recognitionRequest appendAudioPCMBuffer:buffer];
    }];

    // Prepare and start the audio engine
    [self.engine prepare];
    BOOL audioEngineBool = [self.engine startAndReturnError:nil];
    NSLog(@"audioEngineBool----%d", audioEngineBool);
}
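The article defines startRecording but does not show what calls it or how recording is stopped. One plausible way to wire it up is a button action that toggles between the two states. The action name recordButtonTapped: is hypothetical, and the method is a fragment meant to live in the same @implementation as startRecording, using the properties declared earlier.

```objective-c
// Hypothetical action connected to recordBtn's Touch Up Inside event.
// Belongs in the same @implementation as -startRecording.
- (IBAction)recordButtonTapped:(UIButton *)sender {
    if (self.engine.isRunning) {
        // Stop capturing audio and tell the request that no more audio is
        // coming, so the recognizer can deliver its final result.
        [self.engine stop];
        [self.recognitionRequest endAudio];

        // Disable the button until the result handler re-enables it
        // after the final result (or an error) arrives.
        self.recordBtn.enabled = NO;
    } else {
        [self startRecording];
    }
}
```

With this in place, the first tap starts a recognition session and the second tap ends it, after which the final transcription appears in wordTextView.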

And that's it, folks: with this in place, speech recognition works.
