This project requires the VOICEBOX toolbox, available at:
http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
The features most commonly used in speaker (voiceprint) recognition are MFCC and LPC. This project uses MFCC features.
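function [ mfcc_feature ] = get_features( voice_data, fs )
%GET_FEATURES Extract MFCC features from a speech signal.
%   voice_data: audio samples, fs: sampling rate in Hz.
a = 0.92; % pre-emphasis coefficient, 0.9 < a < 1
% Pre-emphasis y[n] = x[n] - a*x[n-1], i.e. FIR numerator [1, -a]
voice_data = filter([1, -a], 1, voice_data);
mfcc_feature = melcepst(voice_data, fs); % MFCC extraction (VOICEBOX melcepst), one row per frame
end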
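The pre-emphasis step is easy to get subtly wrong in MATLAB, because `[1 - a]` (with spaces on both sides of the minus sign) is parsed as the scalar `1 - a`, not as the two-tap filter `[1, -a]`. The following is a minimal sketch of the difference; the constant test signal `x` is a hypothetical input chosen only for illustration.

```matlab
a = 0.92;                       % same coefficient as in get_features
x = ones(5, 1);                 % hypothetical constant test signal

y_ok  = filter([1, -a], 1, x);  % y[n] = x[n] - a*x[n-1]
y_bad = filter([1 - a], 1, x);  % [1 - a] is the scalar 0.08: a plain gain, no filtering

disp(y_ok.');   % first sample 1, then 0.08 for every later sample
disp(y_bad.');  % 0.08 for every sample
```

With the two-tap numerator the constant (low-frequency) part of the signal is attenuated after the first sample, which is the intended high-pass behaviour. The scalar version only rescales the input.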
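%% Initialization
GMM_order = 10; % number of Gaussian components per speaker model
train_path = './train';
test_path = './test';
train_info = dir(train_path);
% Subtract the '.' and '..' entries; assumes train_path holds only speaker folders
n_speakers = length(train_info) - 2;
test_info = dir(strcat(test_path, '/*.wav'));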
%% MFCC features
features = cell(1, n_speakers);
for i = 1:n_speakers
    % List the i-th speaker folder (the first two entries are assumed to be '.' and '..')
    tem_info = dir(strcat(train_info(2 + i).folder, '/', train_info(2 + i).name));
    mfcc_features = [];
    for j = 1:length(tem_info) - 2
        [voice_data, Fs] = audioread(strcat(tem_info(2 + j).folder, '/', tem_info(2 + j).name));
        % Stack the frames of every utterance, one row per frame
        mfcc_features = [mfcc_features; get_features(voice_data, Fs)];
    end
    features{i} = mfcc_features;
end
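%% Model training
GMModels = cell(1, n_speakers);
options = statset('MaxIter', 2000); % upper limit on EM iterations
epochs = 10; % number of EM restarts, passed as 'Replicates'
for i = 1:n_speakers
    % One GMM per speaker: k-means++ initialization, covariance shared across components
    GMModels{i} = fitgmdist(features{i}, GMM_order, 'RegularizationValue', 0.001, 'SharedCovariance', true, 'Options', options, 'Start', 'plus', 'Replicates', epochs);
end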
%% Testing
for i = 1:length(test_info)
    [voice_data, Fs] = audioread(strcat(test_info(i).folder, '/', test_info(i).name));
    mfcc_features = get_features(voice_data, Fs);
    % The second output of posterior is the NEGATIVE log-likelihood, so smaller means a better fit
    [~, nll1] = posterior(GMModels{1}, mfcc_features);
    [~, nll2] = posterior(GMModels{2}, mfcc_features);
    % Only the first two speaker models are compared here
    if nll1 < nll2
        fprintf('%s label: 1\n', test_info(i).name);
    else
        fprintf('%s label: 2\n', test_info(i).name);
    end
end
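The test loop compares only two models. As a hedged sketch of how the same rule could extend to any number of speakers, the example below scores a feature matrix against every model and picks the one with the smallest negative log-likelihood. It is not part of the original script: the synthetic 2-D "features", the variable names `models`, `test_feats`, and `nll`, and the choice of three speakers are all assumptions made for illustration. Like the script above, it needs the Statistics and Machine Learning Toolbox.

```matlab
rng(0);                                   % reproducible synthetic data
n_models = 3;                             % hypothetical number of speakers
models = cell(1, n_models);
for k = 1:n_models
    % Stand-in "features": 2-D points clustered around (5k, 5k)
    models{k} = fitgmdist(randn(300, 2) + 5 * k, 1);
end

test_feats = randn(100, 2) + 5 * 2;       % drawn near the cluster of model 2

nll = zeros(1, n_models);
for k = 1:n_models
    % Second output of posterior: negative log-likelihood of the data under model k
    [~, nll(k)] = posterior(models{k}, test_feats);
end
[~, label] = min(nll);                    % smallest NLL = most likely speaker
fprintf('label: %d\n', label);
```

Because the test points are drawn around the second cluster, the sketch should report label 2. In the real script, `models` would be `GMModels` and `test_feats` the MFCC matrix of one test file.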
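The files should be laid out as follows:
The train folder must contain one subfolder per speaker; each subfolder stands alone and holds that speaker's .wav training recordings.
The test folder contains the .wav files to be identified.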
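As an illustration of that layout, a directory tree might look like the one below. The folder names `speaker1` and `speaker2` and all file names are hypothetical placeholders; only `train` and `test` are fixed by the script.

```text
train/
    speaker1/
        utterance_01.wav
        utterance_02.wav
    speaker2/
        utterance_01.wav
        utterance_02.wav
test/
    unknown_01.wav
    unknown_02.wav
```

Since the script indexes speakers by their position in the `dir` listing, label 1 and label 2 would correspond to the first and second speaker folders in that listing.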