Reading notes on make_mfcc_pitch.sh

This script computes MFCC and pitch features.
Invocation: steps/make_mfcc_pitch.sh --cmd "x exp/make_mfcc/mfccdir || exit 1;

Feature extraction programs:
compute-mfcc-feats # extracts MFCC features
compute-kaldi-pitch-feats # extracts pitch features

Feature processing programs:
paste-feats
copy-feats
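
A rough sketch of how these four programs can be chained so that each utterance ends up with one matrix holding MFCC and pitch columns side by side. The file names wav.scp, conf/mfcc.conf, conf/pitch.conf, mfcc_pitch.ark and mfcc_pitch.scp are placeholders chosen for illustration, and wav.scp is assumed to be sorted by utterance ID. process-kaldi-pitch-feats is a further Kaldi binary, not listed above, that post-processes the raw pitch output. This is a simplified illustration, not a copy of the script itself.

```shell
#!/usr/bin/env bash
# Sketch: paste MFCC and pitch features together for every utterance.
# All paths below are placeholders, not taken from the original note.
wav_scp=wav.scp
mfcc_conf=conf/mfcc.conf
pitch_conf=conf/pitch.conf

# An rspecifier that ends in "|" makes Kaldi run the command and read its stdout.
mfcc_feats="ark:compute-mfcc-feats --config=$mfcc_conf scp:$wav_scp ark:- |"
# "s,cs" declares that the archive is sorted and is accessed in sorted order.
pitch_feats="ark,s,cs:compute-kaldi-pitch-feats --config=$pitch_conf scp:$wav_scp ark:- | process-kaldi-pitch-feats ark:- ark:- |"

# Only run when the Kaldi tools and the input list are actually available.
if command -v paste-feats >/dev/null 2>&1 && [ -f "$wav_scp" ]; then
  paste-feats --length-tolerance=2 "$mfcc_feats" "$pitch_feats" ark:- |
    copy-feats --compress=true ark:- ark,scp:mfcc_pitch.ark,mfcc_pitch.scp
else
  echo "Kaldi tools or $wav_scp not found; commands not run" >&2
fi
```

paste-feats concatenates the two feature streams frame by frame, and copy-feats writes the result as an archive plus its scp index.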

Usage example:
kaldi-trunk/src/featbin/compute-mfcc-feats --use-energy=false --verbose=2 --config=../conf/mfcc.conf scp:lable_to_wav.scp ark:myfeats.ark

Here, lable_to_wav.scp maps an utterance ID to its wav file:

BAC009S0002W0122 kaldi-trunk/egs/aishell/aishell-data/data_aishell/wav/train/S0002/BAC009S0002W0122.wav

The extracted features are written to myfeats.ark.

How do you inspect myfeats.ark? Convert it to text format and print the first few lines:
kaldi-trunk/src/featbin/copy-feats ark:myfeats.ark ark,t:- | head

BAC009S0002W0122 [
37.94254 -14.98815 3.779812 -2.988866 6.984592 12.55763 21.91789 14.40024 0.8388216 0.7873834 18.17512 21.27639 7.671076
36.66135 -16.34631 7.26571 3.157262 7.532941 5.298943 6.635718 3.382703 -4.179379 3.944365 6.671023 5.486343 7.753448

These are MFCC features; each frame (one row) is a 13-dimensional vector.
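
The 13-dimension claim can be checked by counting the columns of each row. A minimal sketch, using the two frames printed above as inline sample data rather than reading a real archive:

```shell
#!/usr/bin/env bash
# The two frames shown in the copy-feats text output above.
frames='37.94254 -14.98815 3.779812 -2.988866 6.984592 12.55763 21.91789 14.40024 0.8388216 0.7873834 18.17512 21.27639 7.671076
36.66135 -16.34631 7.26571 3.157262 7.532941 5.298943 6.635718 3.382703 -4.179379 3.944365 6.671023 5.486343 7.753448'

# Count the fields on every row and keep the distinct counts.
dims=$(printf '%s\n' "$frames" | awk '{ print NF }' | sort -u)
echo "dimensions per frame: $dims"
```

On a real archive, Kaldi's feat-to-dim tool reports the same number directly, for example `feat-to-dim ark:myfeats.ark -`.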
To write the features to a new archive together with an scp index file:
kaldi-trunk/src/featbin/copy-feats ark:myfeats.ark ark,scp:tttt.ark,tttt.scp

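The resulting tttt.scp:
BAC009S0002W0122 tttt.ark:17
This means that the feature matrix for utterance BAC009S0002W0122 starts at byte offset 17 in tttt.ark (the archive written by the command above, not myfeats.ark). The number is indeed a byte offset: the archive stores the 16-character utterance ID followed by one space, so the data begins at byte 17.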
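
A small sketch of where 17 comes from, assuming this utterance is the first entry in the archive: the offset is the length of the key plus the single space that separates it from the data.

```shell
#!/usr/bin/env bash
key=BAC009S0002W0122

# Bytes taken by the key plus the one separating space.
offset=$(printf '%s ' "$key" | wc -c)
offset=$((offset))  # normalise: some wc implementations pad with spaces

echo "$key tttt.ark:$offset"
```

For later entries the offset also includes the size of everything stored before them, so this arithmetic only holds for the first utterance.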

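Notes:
ark:- | # "-" means standard output when writing (standard input when reading), so the result can be piped into another command
scp,p: # followed by the scp file to read; the p (permissive) option skips entries whose data is missing or unreadable instead of failing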
