Python语音基础操作--2.1语音录制,播放,读取

《语音信号处理试验教程》(梁瑞宇等)的代码主要是Matlab实现的,现在Python比较热门,所以把这个项目大部分内容写成了Python实现,大部分是手动写的。使用CSDN博客查看帮助文件:

Python语音基础操作–2.1语音录制,播放,读取
Python语音基础操作–2.2语音编辑
Python语音基础操作–2.3声强与响度
Python语音基础操作–2.4语音信号生成
Python语音基础操作–3.1语音分帧与加窗
Python语音基础操作–3.2短时时域分析
Python语音基础操作–3.3短时频域分析
Python语音基础操作–3.4倒谱分析与MFCC系数
Python语音基础操作–4.1语音端点检测
Python语音基础操作–4.2基音周期检测
Python语音基础操作–4.3共振峰估计
Python语音基础操作–5.1自适应滤波
Python语音基础操作–5.2谱减法
Python语音基础操作–5.4小波分解
Python语音基础操作–6.1PCM编码
Python语音基础操作–6.2LPC编码
Python语音基础操作–6.3ADPCM编码
Python语音基础操作–7.1帧合并
Python语音基础操作–7.2LPC的语音合成
Python语音基础操作–10.1基于动态时间规整(DTW)的孤立字语音识别试验
Python语音基础操作–10.2隐马尔科夫模型的孤立字识别
Python语音基础操作–11.1矢量量化(VQ)的说话人情感识别
Python语音基础操作–11.2基于GMM的说话人识别模型
Python语音基础操作–12.1基于KNN的情感识别
Python语音基础操作–12.2基于神经网络的情感识别
Python语音基础操作–12.3基于支持向量机SVM的语音情感识别
Python语音基础操作–12.4基于LDA,PCA的语音情感识别

代码可在Github上下载:busyyang/python_sound_open

语言录制

import pyaudio
import wave

CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 2
RATE = 16000
RECORD_SECONDS = 2
WAVE_OUTPUT_FILENAME = "Oldboy.wav"

p = pyaudio.PyAudio()

stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)

print("start recording......")

frames = []

for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    data = stream.read(CHUNK)
    frames.append(data)

print("end!")

stream.stop_stream()
stream.close()
p.terminate()

wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()

语音播放

"""PyAudio Example: Play a WAVE file."""

import pyaudio
import wave

CHUNK = 1024
FILENAME = 'C2_1_y.wav'


def player(filename=FILENAME):
    wf = wave.open(filename, 'rb')

    p = pyaudio.PyAudio()

    stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                    channels=wf.getnchannels(),
                    rate=wf.getframerate(),
                    output=True)

    data = wf.readframes(CHUNK)

    while data != b'':
        stream.write(data)
        data = wf.readframes(CHUNK)

    stream.stop_stream()
    stream.close()

    p.terminate()


player(FILENAME)

读取并可视化

import librosa  # 填充,默认频率为22050,可以改变频率
from scipy.io import wavfile  # 原音无损
import numpy as np
import librosa.display
import matplotlib.pyplot as plt
fs, data = wavfile.read('C2_1_y.wav')  # 原始频率,原始数据
print("长度 = {0} 秒".format(len(data) / fs))
data1, sample_rate = librosa.load('C2_1_y.wav')
print("长度 = {0} 秒".format(len(data1) / sample_rate))
plt.figure(figsize=(14, 5))
librosa.display.waveplot(data1, sample_rate)
plt.show()

将上述的一些函数写在一个类里面,可以帮助后续开发过程。

通用基础类

import pyaudio
import wave
import librosa
import librosa.display
import matplotlib.pyplot as plt


# from scipy.io import wavfile

class soundBase:
    def __init__(self, path):
        self.path = path

    def audiorecorder(self, len=2, formater=pyaudio.paInt16, rate=16000, frames_per_buffer=1024, channels=2):
        p = pyaudio.PyAudio()
        stream = p.open(format=formater, channels=channels, rate=rate, input=True, frames_per_buffer=frames_per_buffer)
        print("start recording......")
        frames = []
        for i in range(0, int(rate / frames_per_buffer * len)):
            data = stream.read(frames_per_buffer)
            frames.append(data)
        print("stop recording......")
        stream.stop_stream()
        stream.close()
        p.terminate()
        wf = wave.open(self.path, 'wb')
        wf.setnchannels(channels)
        wf.setsampwidth(p.get_sample_size(formater))
        wf.setframerate(rate)
        wf.writeframes(b''.join(frames))
        wf.close()

    def audioplayer(self, frames_per_buffer=1024):
        wf = wave.open(self.path, 'rb')
        p = pyaudio.PyAudio()
        stream = p.open(format=p.get_format_from_width(wf.getsampwidth()),
                        channels=wf.getnchannels(),
                        rate=wf.getframerate(),
                        output=True)
        data = wf.readframes(frames_per_buffer)
        while data != b'':
            stream.write(data)
            data = wf.readframes(frames_per_buffer)

        stream.stop_stream()
        stream.close()
        p.terminate()

    def audiowrite(self):
        pass

    def audioread(self):
        data, sample_rate = librosa.load(self.path)
        return data, sample_rate

    def soundplot(self, data=[], sr=22050, size=(14, 5)):
        if len(data) == 0:
            data, _ = self.audioread()
        plt.figure(figsize=size)
        librosa.display.waveplot(data, sr=sr)
        plt.show()


sb = soundBase('C2_1_y.wav')
data, sr = sb.audioread()
sb.soundplot(data, sr)

matlab版本更新后函数更新

在实现python代码的时候,有参考matlab的一些相关实现,发现部分之前的matlab函数名已经更新,有整理下面的表格。

matlab老版本(如matlab2010)使用的函数名在新版本中已经取消了,对应关系如下:

wavrecord audiorecorder
wavplay audioplayer
wavwrite audioplayer

你可能感兴趣的:(Python,语音信号)