iOS Development Notes (2): Speech Recognition

    Last time I gave a quick walkthrough of text-to-speech with the iFlytek (科大讯飞) SDK; today I'll write up speech recognition the same way. The preparation needed before writing any code (applying for an appid, importing the libraries) was covered in the previous post on speech synthesis, so have a look there if you haven't set that up yet. This time we'll jump straight into the code.

Detailed steps:

1. After importing the frameworks, add the required headers to the project. The view uses just one UITextField to display the recognized text and two UIButtons (one to start listening, one to stop). Then forward-declare the classes and adopt the delegate protocol, exactly as in the speech-synthesis example.

MainViewController.h

#import <UIKit/UIKit.h>
#import "iflyMSC/IFlySpeechRecognizerDelegate.h"

// Forward-declare the speech recognition classes
@class IFlyDataUploader;
@class IFlySpeechUnderstander;

// Note: the controller must adopt the speech recognizer delegate protocol
@interface MainViewController : UIViewController <IFlySpeechRecognizerDelegate>

@property (nonatomic, strong) IFlySpeechUnderstander *iFlySpeechUnderstander;
@property (strong, nonatomic) IBOutlet UITextField *content;   // shows the recognized text
@property (nonatomic, strong) NSString *result;
@property (nonatomic, strong) NSString *str_result;
@property (nonatomic)         BOOL isCanceled;

- (IBAction)understand:(id)sender;   // start listening
- (IBAction)finish:(id)sender;       // stop listening

@end

MainViewController.m

  1 #import "MainViewController.h"

  2 #import <QuartzCore/QuartzCore.h>

  3 #import <AVFoundation/AVAudioSession.h>

  4 #import <AudioToolbox/AudioSession.h>

  5 

  6 #import "iflyMSC/IFlyContact.h"

  7 #import "iflyMSC/IFlyDataUploader.h"

  8 #import "iflyMSC/IFlyUserWords.h"

  9 #import "iflyMSC/IFlySpeechUtility.h"

 10 #import "iflyMSC/IFlySpeechUnderstander.h"

 11 

 12 @interface MainViewController ()

 13 

 14 @end

 15 

 16 @implementation MainViewController

 17 

 18 - (void)viewDidLoad

 19 {

 20     [super viewDidLoad];

 21     //创建识别对象

 22     //创建语音配置

 23     NSString *initString = [[NSString alloc] initWithFormat:@"appid=%@,timeout=%@",@"53b5560a",@"20000"];   

 24     

 25     //所有服务启动前,需要确保执行createUtility

 26     [IFlySpeechUtility createUtility:initString];

 27     _iFlySpeechUnderstander = [IFlySpeechUnderstander sharedInstance];

 28     _iFlySpeechUnderstander.delegate = self;

 29 }

 30 

 31 -(void)viewWillDisappear:(BOOL)animated

 32 {

 33     [_iFlySpeechUnderstander cancel];

 34     _iFlySpeechUnderstander.delegate = nil;

 35     //设置回非语义识别

 36     [_iFlySpeechUnderstander destroy];

 37     [super viewWillDisappear:animated];

 38 }

 39 

 40 - (void)didReceiveMemoryWarning

 41 {

 42     [super didReceiveMemoryWarning];

 43 }

 44 

 45 - (IBAction)understand:(id)sender {

 46     bool ret = [_iFlySpeechUnderstander startListening];  //开始监听

 47     if (ret) {

 48         self.isCanceled = NO;

 49     }

 50     else{

 51         NSLog(@"启动识别失败!");

 52     }  

 53 }

 54 

 55 - (IBAction)finish:(id)sender {

 56     [_iFlySpeechUnderstander stopListening];   //结束监听,并开始识别

 57 }

 58 

 59 #pragma mark - IFlySpeechRecognizerDelegate

 60 /**

 61  * @fn      onVolumeChanged

 62  * @brief   音量变化回调

 63  * @param   volume      -[in] 录音的音量,音量范围1~100

 64  * @see

 65  */

 66 - (void) onVolumeChanged: (int)volume

 67 {

 68     

 69 }

 70 

 71 /**

 72  * @fn      onBeginOfSpeech

 73  * @brief   开始识别回调

 74  * @see

 75  */

 76 - (void) onBeginOfSpeech

 77 {

 78    

 79 }

 80 

 81 /**

 82  * @fn      onEndOfSpeech

 83  * @brief   停止录音回调

 84  * @see

 85  */

 86 - (void) onEndOfSpeech

 87 {

 88    

 89 }

 90 

 91 /**

 92  * @fn      onError

 93  * @brief   识别结束回调

 94  * @param   errorCode   -[out] 错误类,具体用法见IFlySpeechError

 95  */

 96 - (void) onError:(IFlySpeechError *) error

 97 {

 98     NSString *text ;

 99     if (self.isCanceled) {

100         text = @"识别取消";

101     }

102     else if (error.errorCode ==0 ) {

103         if (_result.length==0) {

104             text = @"无识别结果";

105         }

106         else{

107             text = @"识别成功";

108         }

109     }

110     else{

111         text = [NSString stringWithFormat:@"发生错误:%d %@",error.errorCode,error.errorDesc];

112         NSLog(@"%@",text);

113     }

114 }

115 

116 /**

117  * @fn      onResults

118  * @brief   识别结果回调

119  * @param   result      -[out] 识别结果,NSArray的第一个元素为NSDictionary,NSDictionary的key为识别结果,value为置信度

120  * @see

121  */

122 - (void) onResults:(NSArray *) results isLast:(BOOL)isLast

123 {

124     NSArray * temp = [[NSArray alloc]init];

125     NSString * str = [[NSString alloc]init];

126     NSMutableString *result = [[NSMutableString alloc] init];

127     NSDictionary *dic = results[0];

128     for (NSString *key in dic) {

129         [result appendFormat:@"%@",key];

130         

131     }

132     NSLog(@"听写结果:%@",result);

133     //---------讯飞语音识别JSON数据解析---------//

134     NSError * error;

135     NSData * data = [result dataUsingEncoding:NSUTF8StringEncoding];

136     NSLog(@"data: %@",data);

137     NSDictionary * dic_result =[NSJSONSerialization JSONObjectWithData:data options:NSJSONReadingMutableLeaves error:&error];

138     NSArray * array_ws = [dic_result objectForKey:@"ws"];

139     //遍历识别结果的每一个单词

140     for (int i=0; i<array_ws.count; i++) {

141         temp = [[array_ws objectAtIndex:i] objectForKey:@"cw"];

142         NSDictionary * dic_cw = [temp objectAtIndex:0];

143         str = [str  stringByAppendingString:[dic_cw objectForKey:@"w"]];

144         NSLog(@"识别结果:%@",[dic_cw objectForKey:@"w"]);

145     }

146     NSLog(@"最终的识别结果:%@",str);

    // Drop a chunk that is only the trailing punctuation mark
    if ([str isEqualToString:@"。"] || [str isEqualToString:@"?"] || [str isEqualToString:@"!"]) {
        NSLog(@"Trailing punctuation: %@", str);
    }
    else {
        self.content.text = str;
    }
    _result = str;
}

@end
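A side note on that punctuation check: comparing against a hard-coded list of marks is brittle. A more general alternative is to trim any trailing punctuation with NSCharacterSet. Below is a minimal sketch of my own (TrimTrailingPunctuation is a hypothetical helper name, not an SDK function):

// Hypothetical helper (not part of the iFlytek SDK): strips punctuation
// from the end of the recognized text. punctuationCharacterSet covers
// both Chinese full-width marks (。?!) and their Western counterparts.
static NSString *TrimTrailingPunctuation(NSString *str)
{
    NSCharacterSet *punct = [NSCharacterSet punctuationCharacterSet];
    NSUInteger end = str.length;
    while (end > 0 && [punct characterIsMember:[str characterAtIndex:end - 1]]) {
        end--;
    }
    return [str substringToIndex:end];
}

With something like self.content.text = TrimTrailingPunctuation(str); the text field never shows a dangling mark, regardless of which punctuation the engine appends.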

2. Speech recognition and speech synthesis are broadly similar in how their API functions are used, but recognition returns its results as JSON. While looking through the official SDK, I noticed that the Sample with the built-in UI gets its recognition results back as plain strings. I considered switching to that recognition method, but I really did not want to use the stock UI, so I analyzed the result myself and wrote a small JSON parser. In my tests it works well: iterate over the segmented results, append the pieces, and you end up with one complete sentence.

A sample recognition result:

{"sn":1,"ls":true,"bg":0,"ed":0,"ws":[{"bg":0,"cw":[{"w":" 今天 ","sc":0}]},{"bg":0,"cw":[{"w":"

","sc":0}]},{"bg":0,"cw":[{"w":" 天气 ","sc":0}]},{"bg":0,"cw":[{"w":" 怎么

样 ","sc":0}]},{"bg":0,"cw":[{"w":"","sc":0}]}]}

That's it for this post. If anything is unclear, leave me a comment and I'll reply as soon as I can!
