1. Spectrogram implementation
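The input is mono audio data. The goal is to draw a waterfall plot in which the x axis is time t, the y axis is the spectrogram frequency f, and the z axis is the magnitude of the spectrogram output (in dB).
Using the speech recording BAC009S0002W0122.wav, the time-amplitude plot, spectrum, spectrogram, and time-frequency waterfall plot are produced as shown below.
The recording is sampled at 16000 Hz, and a 3.125 s excerpt is used, i.e. 50000 samples. The dominant frequency of the sound is 179.5 Hz; the remaining components are treated as noise.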
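The figures quoted above follow directly from the sampling frequency and the sample count. A minimal sketch of the arithmetic, using only the numbers stated in the text (the variable names `duration`, `df`, and `f_nyquist` are illustrative and do not appear in the original script):

```matlab
% Values quoted in the text; the derived variable names are illustrative.
Fs = 16000;               % sampling frequency in Hz
Nsamps = 50000;           % number of samples kept from the recording

duration  = Nsamps / Fs;  % excerpt length in seconds
df        = Fs / Nsamps;  % spacing between FFT bins in Hz
f_nyquist = Fs / 2;       % highest frequency the FFT can represent

fprintf('duration = %.3f s, bin spacing = %.2f Hz, Nyquist = %d Hz\n', ...
        duration, df, f_nyquist);
% prints: duration = 3.125 s, bin spacing = 0.32 Hz, Nyquist = 8000 Hz
```

With a bin spacing of 0.32 Hz, a component near 179.5 Hz can be located to within a fraction of a hertz in the full-length FFT.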
2. Fourier transform: spectrum plot
%Do Fourier Transform
y_fft = abs(fft(y)); %Retain Magnitude
y_fft = y_fft(1:Nsamps/2); %Discard Half of Points
f = Fs*(0:Nsamps/2-1)/Nsamps; %Prepare freq data for plot
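To see how a dominant frequency such as 179.5 Hz can be read off this single-sided spectrum, the following sketch applies the same three steps to a synthetic signal. The test signal (a 179.5 Hz tone plus a weaker 1200 Hz tone) is an assumption standing in for the recording, and `f_peak` is an illustrative name:

```matlab
% Synthetic stand-in for the recording (assumption): a strong 179.5 Hz tone
% plus a weaker 1200 Hz tone.
Fs = 16000;                          % sampling frequency in Hz
Nsamps = 50000;                      % number of samples (3.125 s)
t = (0:Nsamps-1) / Fs;               % time axis in seconds
y = sin(2*pi*179.5*t) + 0.2*sin(2*pi*1200*t);

% Same steps as in the text: magnitude, keep first half, build frequency axis.
y_fft = abs(fft(y));                 % retain magnitude
y_fft = y_fft(1:Nsamps/2);           % discard the mirrored half
f = Fs*(0:Nsamps/2-1)/Nsamps;        % frequency of each retained bin in Hz

% Locate the strongest bin and report its frequency.
[~, k] = max(y_fft);
f_peak = f(k);
fprintf('dominant frequency: %.2f Hz\n', f_peak);
```

The reported peak lands on the FFT bin nearest to the true frequency, so it can differ from 179.5 Hz by up to one bin spacing (Fs/Nsamps = 0.32 Hz).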
3. Original sound waveform
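4. Spectrum-time waterfall plot implementation
waterfall requires its three arguments to have matching row and column dimensions, and the third argument must be real-valued. specgram returns complex coefficients, so their magnitude abs(S) is plotted instead.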
[S,F,T]=specgram(y,2048,16000,2048,1536);
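% y: waveform data
% 2048: FFT frame length in points (128 ms at a 16000 Hz sampling rate)
% 16000: sampling frequency in Hz
% 2048: window length, usually equal to the frame length
% 1536: frame overlap in points, here 3/4 of the frame length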
[t_test,f_test] = meshgrid(T,F);
y_test=abs(S);
figure(5);
waterfall(t_test,f_test,y_test);
ylabel('Frequency (Hz)')
xlabel('time(s)')
zlabel('Amplitude')
title('Spectrum-Time Waterfall Plot')
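The opening description asks for z in dB, whereas abs(S) is a linear magnitude. The sketch below shows one way to obtain a dB surface with the same frame settings (2048-point frames, 1536-point overlap). It is a hand-rolled short-time Fourier transform on a synthetic tone, not a reproduction of specgram's internals; the test signal, the periodic Hann window, and the names `hop`, `nframes`, `win`, `S_db`, `t_grid`, and `f_grid` are all assumptions made for illustration:

```matlab
% Hand-rolled STFT sketch (not specgram's implementation).
Fs = 16000;                          % sampling frequency in Hz
nfft = 2048;                         % frame and window length in samples
noverlap = 1536;                     % overlap between consecutive frames
hop = nfft - noverlap;               % frame advance: 512 samples

t = (0:50000-1) / Fs;                % 3.125 s time axis
y = sin(2*pi*179.5*t);               % synthetic test tone (assumption)

win = 0.5 - 0.5*cos(2*pi*(0:nfft-1)/nfft);     % periodic Hann window (row)
nframes = 1 + floor((numel(y) - nfft) / hop);  % number of complete frames
S = zeros(nfft/2 + 1, nframes);      % one column per frame

for m = 1:nframes
    idx = (m-1)*hop + (1:nfft);      % sample indices of frame m
    X = fft(y(idx) .* win);          % windowed FFT of the frame
    S(:, m) = X(1:nfft/2 + 1).';     % keep the non-negative frequencies
end

F = (0:nfft/2) * Fs / nfft;                    % frequency axis in Hz
T = ((0:nframes-1) * hop + nfft/2) / Fs;       % frame-centre times in s

S_db = 20*log10(abs(S) + eps);       % magnitude in dB; eps avoids log10(0)
[t_grid, f_grid] = meshgrid(T, F);   % grids with the same size as S_db
```

The result can then be drawn with waterfall(t_grid, f_grid, S_db) and zlabel('Magnitude (dB)'). Because S_db is real-valued, negative dB values are plotted without any further conversion.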
5. Waterfall plot
waterfall needs the z-axis data to be defined on the same grid as the x- and y-axis data, as with the function sin(2*pi*f_test.*t_test):
t_test = 0:0.01:1;
f_test = 1:5;
[t_test,f_test] = meshgrid(t_test,f_test);
y_test=sin(2*pi*f_test.*t_test); % z-axis data
figure(4);
waterfall(t_test,f_test,y_test);
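The grid relationship in this example can be checked by inspecting the matrix sizes. The sketch below repeats the same construction without plotting; the names `t_vec`, `f_vec`, `t_grid`, `f_grid`, and `z` are illustrative:

```matlab
t_vec = 0:0.01:1;                    % 101 time points
f_vec = 1:5;                         % 5 frequencies in Hz
[t_grid, f_grid] = meshgrid(t_vec, f_vec);

% Each row of z is one sine trace at a fixed frequency.
z = sin(2*pi*f_grid.*t_grid);

disp(size(t_grid));                  % 5 rows (frequencies) by 101 columns (times)
disp(size(z));                       % same size as the grids
fprintf('min(z) = %.2f, max(z) = %.2f\n', min(z(:)), max(z(:)));
```

meshgrid puts the first vector along the columns and the second along the rows, so waterfall draws one line per row, here one per frequency. The z values range over both signs, and waterfall plots them as they are.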
The complete script is as follows:
[y_orignal,Fs] = audioread('BAC009S0002W0122.wav');
y = y_orignal(1:50000);
Nsamps = length(y);
t_orignal = (1/Fs)*(1:Nsamps); %Prepare time data for plot
t = t_orignal(1:50000);
%Do Fourier Transform
y_fft = abs(fft(y)); %Retain Magnitude
y_fft = y_fft(1:Nsamps/2); %Discard Half of Points
f = Fs*(0:Nsamps/2-1)/Nsamps; %Prepare freq data for plot
%Plot Sound File in Time Domain
figure(1)
plot(t, y)
xlabel('Time (s)')
ylabel('Amplitude')
title('Speech Signal in Time Domain')
%Plot Sound File in Frequency Domain
figure(2)
plot(f, y_fft)
xlim([0 1000])
xlabel('Frequency (Hz)')
ylabel('Amplitude')
title('Magnitude Spectrum of Speech Signal')
figure(3)
%[t,f] = meshgrid(t,f);
z_fft = y_fft';
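%waterfall(t,f,z_fft); % x runs along the rows of z, y along the columns of z
% Spectrogram
specgram(y,2048,16000,2048,1536);
% y: waveform data
% 2048: FFT frame length in points (128 ms at a 16000 Hz sampling rate)
% 16000: sampling frequency in Hz
% 2048: window length, usually equal to the frame length
% 1536: frame overlap in points, here 3/4 of the frame length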
[S,F,T]=specgram(y,2048,16000,2048,1536);
% Same arguments as the call above; here the outputs are captured:
% S is the complex short-time spectrum, F the frequencies (Hz), T the frame times (s)
[t_test,f_test] = meshgrid(T,F);
y_test=abs(S);
figure(5);
waterfall(t_test,f_test,y_test);
ylabel('Frequency (Hz)')
xlabel('time(s)')
zlabel('Amplitude')
title('Spectrum-Time Waterfall Plot')
t_test = 0:0.01:1;
f_test = 1:5;
[t_test,f_test] = meshgrid(t_test,f_test);
y_test=sin(2*pi*f_test.*t_test);
figure(4);
waterfall(t_test,f_test,y_test);