Chap 7


Dynamic Time Warping in Speech Recognition

Experiment with the MATLAB script IsoDigitRec.m to match recordings of spoken digits to the set of template patterns (zero.wav, one.wav, …) provided in the textbook’s software.
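
As a rough illustration of the matching step, the sketch below runs plain dynamic time warping on toy feature sequences and picks the template with the lowest accumulated cost. It is a minimal sketch under stated assumptions, not the textbook's routine: the data, the variable names (protos, cost, best) and the unconstrained local transitions are made up for illustration, and no endpoint constraints are applied.

```matlab
% Toy 2-D feature sequences (one column per frame); all values are made up.
ref1 = [0 1 2 3 2 1 0; 0 1 1 2 1 1 0];
ref2 = [3 2 1 0 1 2 3; 2 1 1 0 1 1 2];
% The test pattern is ref1 with two frames repeated (a time-stretched copy).
test = [0 0 1 2 3 3 2 1 0; 0 0 1 1 2 2 1 1 0];

protos = {ref1, ref2};            % hypothetical template set
cost = zeros(1, numel(protos));

for k = 1:numel(protos)
    ref = protos{k};
    I = size(ref, 2);
    J = size(test, 2);

    % Local cost: Euclidean distance between every pair of frames.
    d = zeros(I, J);
    for i = 1:I
        for j = 1:J
            d(i, j) = norm(ref(:, i) - test(:, j));
        end
    end

    % Accumulated cost; D is padded with Inf so the recursion needs no
    % special cases. Allowed predecessors: (i-1,j), (i,j-1), (i-1,j-1).
    D = inf(I + 1, J + 1);
    D(1, 1) = 0;
    for i = 1:I
        for j = 1:J
            D(i + 1, j + 1) = d(i, j) + min([D(i, j + 1), D(i + 1, j), D(i, j)]);
        end
    end
    cost(k) = D(end, end);
end

% The template with the smallest matching cost is the recognised one.
[~, best] = min(cost);
fprintf('Best template index: %d\n', best);
```

Because the test sequence only repeats frames of the first template, the warping path can absorb the repetitions, so that template should win.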

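Small fixes to IsoDigitRec.m

  1. ind=strfind(curDir,'\'); is changed to ind=strfind(curDir,'/'); because the path separator is '/' on Linux and macOS.
  2. wavread is deprecated and has been removed from recent MATLAB releases, so replace each [x,Fs,bits]=wavread(...) call with [x,Fs]=audioread(...).
  3. audioread needs the file name including its extension, so change protoNames={'zero', ...} to protoNames={'zero.wav', ...} accordingly.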
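
The fixes above could also be written without a hard-coded separator. The sketch below is one possible variant, not the code used in these notes: it uses fileparts and fullfile instead of strfind, writes a short synthetic tone to a temporary file (demo_tone.wav is a hypothetical name) so that it runs without the textbook data, and recovers the bit depth that wavread used to return via audioinfo.

```matlab
% Build the path of a sibling "data" folder portably.
% fileparts(pwd) returns the parent folder; fullfile inserts the
% separator that is correct for the current platform.
dataDir = fullfile(fileparts(pwd), 'data');

% Write a 0.1 s synthetic tone so the example has a file to read.
Fs = 8000;
t = (0:Fs/10 - 1)' / Fs;
tmpFile = fullfile(tempdir, 'demo_tone.wav');   % hypothetical file name
audiowrite(tmpFile, 0.5 * sin(2 * pi * 440 * t), Fs);

% audioread returns only the samples and the sampling rate.
[x, FsRead] = audioread(tmpFile);

% The third output of wavread (bits per sample) is available from audioinfo.
info = audioinfo(tmpFile);
bits = info.BitsPerSample;

fprintf('Read %d samples at %d Hz, %d bits per sample\n', numel(x), FsRead, bits);
```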

Fixed IsoDigitRec.m (excerpt)

% IsoDigitRec.m (Example 5.4)
% "Introduction to Pattern Recognition: A MATLAB Approach"
% S. Theodoridis, A. Pikrakis, K. Koutroumbas, D. Cavouras
% As a first step, the data folder of Chapter 5 is appended to the existing
% MATLAB path.
addpath([curDir 'data'],'-end');
% To build the system, we will use short-term Energy and short-term
% Zero-Crossing Rate (Section 7.5.4, [Theo 09]) as features, so that each signal is
% represented by a sequence of two-dimensional feature vectors. Note that this is not
% an optimal feature set in any sense and it has only been adopted on the basis of
% simplicity. The feature extraction stage is accomplished by typing the following
% code:
for i=1:length(protoNames)
    winlength = round(0.02*Fs); % 20 ms moving window length
    winstep = winlength; % moving window step. No overlap
% To find the best match for an unknown pattern, say a pattern stored in file
% "upattern1.wav", type the following code:
winlength = round(0.02*Fs); % use the same values as before
winstep = winlength;
LeftEndConstr=round(tolerance/winstep); % left endpoint constraint
RightEndConstr = LeftEndConstr;
for i=1:length(protoNames)
fprintf('The unknown pattern has been identified as a "%s" \n',protoNames{indexofBest});
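
The comments in the listing describe each signal as a sequence of two-dimensional vectors of short-term energy and zero-crossing rate. The sketch below shows one way such features could be computed with the same 20 ms non-overlapping window. It is an illustration under assumptions: the test signal is synthetic, and the normalisations (energy divided by the window length, zero crossings divided by twice the window length) are chosen here and may differ from the textbook's own feature functions.

```matlab
% Synthetic signal: 0.5 s of silence followed by 0.5 s of a 1 kHz tone.
Fs = 8000;
t = (0:Fs/2 - 1)' / Fs;
x = [zeros(Fs/2, 1); sin(2 * pi * 1000 * t)];

winlength = round(0.02 * Fs);   % 20 ms moving window length, in samples
winstep = winlength;            % moving window step, no overlap
nFrames = floor((length(x) - winlength) / winstep) + 1;

E = zeros(1, nFrames);          % short-term energy per frame
Zcr = zeros(1, nFrames);        % short-term zero-crossing rate per frame
for k = 1:nFrames
    frame = x((k - 1) * winstep + (1:winlength));
    % Mean squared amplitude of the frame (assumed normalisation).
    E(k) = sum(frame .^ 2) / winlength;
    % Each sign change contributes 2 to the sum of |diff(sign)|.
    Zcr(k) = sum(abs(diff(sign(frame)))) / (2 * winlength);
end

% One two-dimensional feature vector per frame, stored column-wise.
F = [E; Zcr];
fprintf('%d frames, feature matrix is %d x %d\n', nFrames, size(F, 1), size(F, 2));
```

Silent frames give zero for both features, while the tone frames give a clearly non-zero energy and crossing rate, which is the kind of contrast the recogniser relies on.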

Result for Patterns

To test another pattern, change [test,Fs]=audioread('upattern1.wav'); to [test,Fs]=audioread('upattern02.wav'); and so on for the remaining files.

The results for all patterns:

Name of Pattern    Identified as
upattern1.wav      zero.wav
upattern02.wav     zero.wav
upattern11.wav     zero.wav
upattern12.wav     one.wav
upattern13.wav     one.wav
upattern14.wav     three.wav
upattern15.wav     zero.wav
upattern16.wav     four.wav
upattern17.wav     four.wav
upattern21.wav     three.wav
upattern22.wav     two.wav
upattern23.wav     two.wav
upattern51.wav     five.wav
upattern61.wav     six.wav
Chap 8


HMM recognition and training

Run example633.m, example634.m, example635.m and example636.m


Copy BackTracking.m from the Chapter 5 folder into the Chapter 6 function and example folder, because MultSeqTrainDoHMMVITsc.m calls BackTracking.m.

Contents of the functions

% CHAPTER 6: m-files
%   BWDoHMMsc              - Computes the recognition probability of an HMM, given a sequence of
%                            discrete observations, by means of the scaled version of the
%                            Baum-Welch (any-path) method.
%   BWDoHMMst              - Same as BWDoHMMsc, except that no scaling is employed.
%   MultSeqTrainCoHMMBWsc  - Baum-Welch training (scaled version) of a Continuous Observation
%                            HMM, given multiple training sequences. Each sequence 
%                            consists of l-dimensional feature vectors.
%                            It is assumed that the pdf associated with each state 
%                            is a multivariate Gaussian mixture. 
%   MultSeqTrainDoHMMBWsc  - Baum-Welch training (scaled version) of a Discrete Observation
%                            HMM, given multiple training sequences.
%   MultSeqTrainDoHMMVITsc - Viterbi training (scaled version) of a Discrete Observation HMM, given
%                            multiple training sequences.
%   VitCoHMMsc             - Computes the scaled Viterbi score of an HMM, given a sequence of
%                            l-dimensional vectors of continuous observations, under the
%                            assumption that the pdf of each state is a Gaussian mixture.
%   VitCoHMMst             - Same as VitCoHMMsc except that no scaling is employed.
%   VitDoHMMsc             - Computes the scaled Viterbi score of a Discrete Observation HMM, 
%                            given a sequence of observations.
%   VitDoHMMst             - Same as VitDoHMMsc, except that no scaling is employed.
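
To make the "scaled" any-path score in the listing above more concrete, the sketch below computes the log-probability of a short discrete observation sequence with a scaled forward recursion. It is a minimal sketch under assumptions and not the code of BWDoHMMsc: the model parameters and observations are made up, B is stored as symbols by states, and the natural logarithm is used (the textbook functions may use a different base).

```matlab
% Toy two-state, two-symbol HMM; all numbers are made up for illustration.
pi_init = [0.5; 0.5];          % initial state probabilities
A = [0.7 0.3; 0.4 0.6];        % A(i,j) = P(state j at t+1 | state i at t)
B = [0.9 0.2; 0.1 0.8];        % B(k,j) = P(symbol k | state j)
O = [1 1 2 2 1];               % observation sequence (symbol indices)

T = length(O);

% Initialisation, followed by rescaling so that alpha sums to one.
alpha = pi_init .* B(O(1), :)';
c = sum(alpha);
alpha = alpha / c;
logProb = log(c);

% Recursion: propagate, weight by the emission, rescale, accumulate log-scale.
for t = 2:T
    alpha = (A' * alpha) .* B(O(t), :)';
    c = sum(alpha);
    alpha = alpha / c;
    logProb = logProb + log(c);
end

fprintf('Scaled forward log-probability: %.4f\n', logProb);
```

The product of the scaling factors equals the unscaled sequence probability, so summing their logarithms avoids the underflow that the unscaled variants can run into on long sequences.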

Result for example633.m

epoch =     1
epoch =     2
piTrained_1 =
ATrained_1 =
    0.6743    0.3257
    0.6746    0.3254
BTrained_1 =
    0.7672    0.3544
    0.2328    0.6456
% press any key    
epoch =     1
epoch =     2
epoch =     3
epoch =     4
epoch =     5
epoch =     6
epoch =     7
epoch =     8
epoch =     9
epoch =    10
epoch =    11
epoch =    12
epoch =    13
piTrained_2 =
ATrained_2 =
    1.0000    0.0000
         0    1.0000
BTrained_2 =
    0.6333         0
    0.3667    1.0000
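
A quick sanity check on the printed matrices is to confirm that they are still valid probability tables. Reading the output above, the transition matrices appear to be normalised by row and the observation matrices by column; the sketch below checks that reading on the values printed for example633.m. The tolerance (tol) is an assumption that allows for the four-decimal rounding of the display.

```matlab
% Values copied from the example633.m output above.
ATrained_1 = [0.6743 0.3257; 0.6746 0.3254];
BTrained_1 = [0.7672 0.3544; 0.2328 0.6456];
ATrained_2 = [1.0000 0.0000; 0 1.0000];
BTrained_2 = [0.6333 0; 0.3667 1.0000];

tol = 1e-3;   % assumed tolerance for rounding in the printed values

% Each row of a transition matrix should sum to one.
rowsOkA1 = all(abs(sum(ATrained_1, 2) - 1) < tol);
rowsOkA2 = all(abs(sum(ATrained_2, 2) - 1) < tol);

% Each column of an observation matrix should sum to one.
colsOkB1 = all(abs(sum(BTrained_1, 1) - 1) < tol);
colsOkB2 = all(abs(sum(BTrained_2, 1) - 1) < tol);

fprintf('A rows ok: %d %d, B columns ok: %d %d\n', rowsOkA1, rowsOkA2, colsOkB1, colsOkB2);
```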

Result for example634.m

theEpoch =     1
theEpoch =     2
piTrained_1 =
ATrained_1 =
    0.6278    0.3722
    0.6288    0.3712
BTrained_1 =
     1     0
     0     1

Result for example635.m

epoch =     1
epoch =     2
piTrained_1 =
ATrained_1 =
    0.6743    0.3257
    0.6746    0.3254
BTrained_1 =
    0.7672    0.3544
    0.2328    0.6456
% press any key
epoch =     1
epoch =     2
epoch =     3
epoch =     4
epoch =     5
epoch =     6
epoch =     7
epoch =     8
epoch =     9
epoch =    10
epoch =    11
epoch =    12
epoch =    13
piTrained_2 =
ATrained_2 =
    1.0000    0.0000
         0    1.0000
BTrained_2 =
    0.6333         0
    0.3667    1.0000

Result for example636.m

Pr1 =   -8.8513
Pr2 =  -15.1390
bs1 =
     1     1     1     1     1     1     1     2     2     2     2     2     2
bs2 =
     1     2     2     2     2     2     2     2     2     2     2     2     2
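
The bs1 and bs2 vectors above are best-state sequences. As an illustration of how such a sequence can be obtained, the sketch below runs a log-domain Viterbi recursion with backtracking on a toy model. It is a sketch under assumptions and not the code of VitDoHMMsc: the parameters, the observation sequence and the variable names (delta, psi, bs, score) are made up, and the natural logarithm is used.

```matlab
% Toy two-state, two-symbol HMM; all numbers are made up for illustration.
pi_init = [0.5; 0.5];          % initial state probabilities
A = [0.9 0.1; 0.1 0.9];        % A(i,j) = P(state j at t+1 | state i at t)
B = [0.9 0.1; 0.1 0.9];        % B(k,j) = P(symbol k | state j)
O = [1 1 1 2 2 2];             % observation sequence (symbol indices)

N = length(pi_init);
T = length(O);
delta = zeros(N, T);           % best log-score of any path ending in state j at time t
psi = zeros(N, T);             % best predecessor of state j at time t

% Initialisation.
delta(:, 1) = log(pi_init) + log(B(O(1), :))';

% Recursion: keep only the best predecessor for every state.
for t = 2:T
    for j = 1:N
        [best, psi(j, t)] = max(delta(:, t - 1) + log(A(:, j)));
        delta(j, t) = best + log(B(O(t), j));
    end
end

% Termination and backtracking along the stored predecessors.
bs = zeros(1, T);
[score, bs(T)] = max(delta(:, T));
for t = T - 1:-1:1
    bs(t) = psi(bs(t + 1), t + 1);
end

fprintf('Viterbi log-score: %.4f\n', score);
disp(bs);
```

With these toy parameters the best path stays in state 1 while symbol 1 is observed and switches to state 2 when symbol 2 starts, which mirrors the single switch visible in bs1.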
