1. Generate the WAV file to recognize. SpeechRecognition works on WAV files and cannot recognize MP3 files.
Install the libraries:
sudo apt install espeak ffmpeg libespeak1
pip install pyttsx3
Code:
def demo_tts_wav():
    import pyttsx3
    engine = pyttsx3.init()
    engine.setProperty('rate', 150)       # speaking rate (words per minute)
    engine.setProperty('volume', 1.0)     # volume, from 0.0 to 1.0
    voices = engine.getProperty('voices')
    engine.setProperty('voice', voices[0].id)  # use the first installed voice
    text = '你好,我是一个AI机器人'
    #engine.say(text)                     # speak aloud instead of saving to a file
    filename = 'ni_hao.wav'
    engine.save_to_file(text, filename)   # queue synthesis of the text into a WAV file
    engine.runAndWait()                   # block until the queued job finishes writing the file
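Note that voices[0] is not necessarily a Mandarin voice; which voices exist depends on the local espeak installation. A minimal sketch for preferring a Chinese voice when one is available (assuming Mandarin voices expose 'zh' or 'cmn' in their id or name, which may differ on your system):

# Sketch: prefer a Mandarin voice if the local espeak install provides one.
# Assumption: Chinese voices contain 'zh' or 'cmn' in their id/name; adjust for your system.
import pyttsx3

def pick_chinese_voice(engine):
    for voice in engine.getProperty('voices'):
        ident = ((voice.id or '') + ' ' + (voice.name or '')).lower()
        if 'zh' in ident or 'cmn' in ident:
            return voice.id
    return None  # no Mandarin voice found; keep the default

engine = pyttsx3.init()
voice_id = pick_chinese_voice(engine)
if voice_id is not None:
    engine.setProperty('voice', voice_id)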
2. Speech recognition with speech_recognition
Install the libraries:
pip install SpeechRecognition
pip install pocketsphinx
Download the Mandarin model files (only needed if you use recognize_sphinx): CMU Sphinx - Browse /Acoustic and Language Models/Mandarin at SourceForge.net
pip install vosk
Download a model into the code directory: VOSK Models
Unzip it and rename the folder to model
Code:
def demo_speech_recognition():
    import speech_recognition as sr
    r = sr.Recognizer()
    try:
        audio_file = sr.AudioFile('ni_hao.wav')
        with audio_file as source:
            audio_data = r.record(source)  # read the whole file into memory
        #text = r.recognize_google(audio_data, language='zh-CN')  # online, Google Web Speech API
        #text = r.recognize_wit(audio_data)                       # online, needs a Wit.ai key
        text = r.recognize_vosk(audio_data)  # offline; loads the "model" folder in the current directory
        print("Recognition result:", text)
    except Exception as e:
        print("Could not recognize speech:", str(e))
3. Using the whisper library: it gives the best results and works offline
Install:
pip install -U openai-whisper
If the model weights are hard to download directly, you can get them here: https://download.csdn.net/download/love_xunmeng/88651611
Then move the file into place:
mv small.pt /home/user_account/.cache/whisper/
Code:
def demo_whisper():
    import whisper
    model = whisper.load_model("small")      # loads ~/.cache/whisper/small.pt, downloading it if missing
    result = model.transcribe("ni_hao.wav")  # language is auto-detected by default
    print(result["text"])