Disclaimer: these are self-study notes, based on a Bilibili video:
https://www.bilibili.com/video/BV1Jk4y1R7a5?p=2
This blog post was also used as a reference: https://blog.csdn.net/Datapad/article/details/82970253
Installing SpeechRecognition
C:\Users\Administrator>pip3 install SpeechRecognition
......
Installing collected packages: SpeechRecognition
Successfully installed SpeechRecognition-3.8.1
The "Successfully installed" line above confirms that the installation worked.
Installing pocketsphinx
C:\Users\Administrator>pip install pocketsphinx
......
Installing collected packages: pocketsphinx
Successfully installed pocketsphinx-0.1.15
The "Successfully installed" line above confirms that the installation worked.
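Besides reading the pip output, the installed versions can also be checked from Python. This is a minimal sketch using the standard library's importlib.metadata (Python 3.8+); the helper name installed_version is my own and is not part of either package.

```python
from importlib import metadata  # standard library, Python 3.8+


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Report the two packages installed in the steps above
for name in ("SpeechRecognition", "pocketsphinx"):
    version = installed_version(name)
    print(f"{name}: {version if version else 'not installed'}")
```

If either line prints "not installed", the pip command was probably run for a different Python interpreter than the one running the script.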
Code example
Read a WAV speech file and transcribe it in full or in part
import speech_recognition as sr

r = sr.Recognizer()  # create the recognizer instance
harvard = sr.AudioFile(r'E:\speek\harvard.wav')  # raw string keeps the backslashes literal

# Open the file with a context manager and read its contents
with harvard as source:
    all_audio = r.record(source)  # record() captures the whole file

# Check the type
print(type(all_audio))  # an AudioData instance
all_text = r.recognize_sphinx(all_audio)  # transcribe offline with pocketsphinx
print(all_text)
# this they'll smell of old we're lingers it takes heat to
# bring out the odor called it restores health and zest
# case all the colt is fine with him couples all pastore
# my favorite is as full food is the hot cross mon

# Transcribe only part of the file
with harvard as source:
    # Select a segment of the audio file by offset and duration
    audio = r.record(source, offset=4, duration=3)  # start at second 4, capture 3 seconds
text = r.recognize_sphinx(audio)  # transcribe the segment
print(text)  # it takes heat to bring out the odor
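Before transcribing a segment, it can help to listen to exactly what offset and duration select. The sketch below copies that segment into a new WAV file. It uses the standard library's wave module, which is my own choice and not part of the original walkthrough, and the function name extract_segment is hypothetical. It assumes an uncompressed PCM WAV file.

```python
import wave


def extract_segment(src_path, dst_path, offset, duration):
    """Copy `duration` seconds starting at `offset` seconds into a new PCM WAV file.

    Returns the number of frames written.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        frame_size = src.getsampwidth() * src.getnchannels()
        # Clamp the start so an offset past the end yields an empty segment
        start = min(int(offset * rate), src.getnframes())
        src.setpos(start)
        # readframes() returns fewer frames if the file ends early
        frames = src.readframes(int(duration * rate))

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)    # same channels, sample width and sample rate
        dst.writeframes(frames)  # the frame count in the header is fixed up on write

    return len(frames) // frame_size
```

For the example above, a call such as extract_segment(r'E:\speek\harvard.wav', r'E:\speek\segment.wav', offset=4, duration=3) should write the same three seconds that r.record(source, offset=4, duration=3) captures, so the segment can be played back and the boundaries adjusted.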
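Notes
1. If you know in advance how the speech in the audio file is structured, the offset and duration keyword arguments are very useful for selecting segments. Used carelessly, however, they give poor transcriptions, for instance when a segment starts or ends in the middle of a word.
2. The audio file must be PCM WAV, AIFF/AIFF-C, or native FLAC; any other format raises an error.
3. Audio file download: https://pan.baidu.com/s/10oClt_NWgjOsDmIPuqQGzg (extraction code: 0wv4)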
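For notes 1 and 2, a quick check of the file before handing it to the recognizer can save some guessing. This is only a sketch with the standard library's wave module (not something the original tutorial uses), and describe_wav is a name I made up. It covers uncompressed PCM WAV only and says nothing about AIFF or FLAC files.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file.

    wave.open() raises wave.Error when the file is not a WAV file it can read.
    """
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_seconds": frames / rate,
        }
```

The duration_seconds value gives the upper bound for offset + duration, and a wave.Error here is an early hint that the file needs converting before it is passed to sr.AudioFile.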