Kaldi Step by Step on AIShell v1 S5, Part 5: DNN (chain)


  • Acknowledgements
  • Machine Configuration
    • Problem: the graphics card is old and there is only one GPU; how can a TDNN model still be trained?
  • Kaldi Step by Step on AIShell v1 S5, Part 5: DNN (chain)
    • Part 14: DNN Chain Model
    • Part 12: Chain Training, Decoding, and Scoring
    • Part 15: Iterations

Acknowledgements

Thanks to AIShell for its exploration on the road to commercialization. Looking forward to the arrival of v3.

Machine Configuration

sv@HP:~$ sudo lsb_release -a
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.1 LTS
Release:	18.04
Codename:	bionic

sv@HP:~$ cat /proc/cpuinfo | grep model\ name
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
sv@HP:~$ cat /proc/meminfo | grep MemTotal
MemTotal:       16321360 kB
sv@HP:~$ lspci | grep 'VGA'
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)

Problem: the graphics card is old and there is only one GPU; how can a TDNN model still be trained?

**Answer:**
Set both num-jobs-initial and num-jobs-final to 1, reduce epochs to 2 or 3, and set the GPU to exclusive compute mode:
sv@HP:~/lkaldi/egs/aishell/s5$ sudo nvidia-smi -c 3
[sudo] password for sv: 
Set compute mode to EXCLUSIVE_PROCESS for GPU 00000000:01:00.0.
All done.
sv@HP:~/lkaldi/egs/aishell/s5$ sudo nvidia-smi
Wed Jan 16 10:31:58 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.78       Driver Version: 410.78       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1070    Off  | 00000000:01:00.0  On |                  N/A |
| 27%   31C    P8     7W / 151W |    225MiB /  8116MiB |      0%   E. Process |
+-------------------------------+----------------------+----------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      1432      G   /usr/lib/xorg/Xorg                           125MiB |
|    0      1645      G   /usr/bin/gnome-shell                          94MiB |
|    0      2622      G   /opt/firefox/firefox-bin                       3MiB |
+-----------------------------------------------------------------------------+

Kaldi Step by Step on AIShell v1 S5, Part 5: DNN (chain)

The final installment. A chain model can serve results online and in real time, which is what gives it independent commercial value.

Part 14: DNN Chain Model

Results first. The nnet3 DNN brings the error rate down to 8.68%, and the chain model brings it down to 7.72%. (The grep below reads the cer_* files, so these are character error rates even though the lines are labeled %WER.)

sv@HP:~/lkaldi/egs/aishell/s5$ for x in exp/*/decode_test exp/*/*/decode_test; do [ -d $x ] && grep WER $x/cer_* | utils/best_wer.sh; done 2>/dev/null

%WER 36.59 [ 38335 / 104765, 849 ins, 3183 del, 34303 sub ] exp/mono/decode_test/cer_10_0.0
%WER 18.83 [ 19727 / 104765, 971 ins, 1161 del, 17595 sub ] exp/tri1/decode_test/cer_13_0.5
%WER 18.79 [ 19684 / 104765, 957 ins, 1142 del, 17585 sub ] exp/tri2/decode_test/cer_14_0.5
%WER 16.84 [ 17643 / 104765, 791 ins, 991 del, 15861 sub ] exp/tri3a/decode_test/cer_14_0.5
%WER 13.63 [ 14277 / 104765, 762 ins, 639 del, 12876 sub ] exp/tri4a/decode_test/cer_13_0.5
%WER 8.68 [ 9097 / 104765, 355 ins, 464 del, 8278 sub ] exp/nnet3/tdnn_sp/decode_test/cer_14_1.0
%WER 7.72 [ 8087 / 104765, 364 ins, 552 del, 7171 sub ] exp/chain/tdnn_1a_sp/decode_test/cer_11_0.5
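Each line above reports the error rate as (insertions + deletions + substitutions) / total reference tokens. A quick sketch recomputing the two DNN numbers from the bracketed counts:

```python
# Recompute the error rate from the counts Kaldi prints, e.g.:
# %WER 7.72 [ 8087 / 104765, 364 ins, 552 del, 7171 sub ]
def error_rate(ins, dele, sub, total_ref):
    """Error rate in percent: (insertions + deletions + substitutions) / reference length."""
    return 100.0 * (ins + dele + sub) / total_ref

chain = error_rate(364, 552, 7171, 104765)
nnet3 = error_rate(355, 464, 8278, 104765)
print(f"chain: {chain:.2f}%")  # chain: 7.72%
print(f"nnet3: {nnet3:.2f}%")  # nnet3: 8.68%
```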

Part 12: Chain Training, Decoding, and Scoring

sv@HP:~/lkaldi/egs/aishell/s5$ local/chain/run_tdnn.sh
local/chain/run_tdnn.sh 
local/nnet3/run_ivector_common.sh: preparing directory for low-resolution speed-perturbed data (for alignment)
utils/data/perturb_data_dir_speed_3way.sh: data/train_sp/feats.scp already exists: refusing to run this (please delete data/train_sp/feats.scp if you want this to run)

(A stale data/train_sp/feats.scp from an earlier attempt stops this first run; delete it as the message says, then rerun.)

sv@HP:~/lkaldi/egs/aishell/s5$ local/chain/run_tdnn.sh
local/chain/run_tdnn.sh 
local/nnet3/run_ivector_common.sh: preparing directory for low-resolution speed-perturbed data (for alignment)
utils/data/perturb_data_dir_speed_3way.sh: making sure the utt2dur and the reco2dur files are present
... in data/train, because obtaining it after speed-perturbing
... would be very slow, and you might need them.
utils/data/get_utt2dur.sh: data/train/utt2dur already exists with the expected length.  We won't recompute it.
utils/data/get_reco2dur.sh: data/train/reco2dur already exists with the expected length.  We won't recompute it.
utils/data/perturb_data_dir_speed.sh: generated speed-perturbed version of data in data/train, in data/train_sp_speed0.9
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp_speed0.9
utils/data/perturb_data_dir_speed.sh: generated speed-perturbed version of data in data/train, in data/train_sp_speed1.1
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp_speed1.1
utils/data/combine_data.sh data/train_sp data/train data/train_sp_speed0.9 data/train_sp_speed1.1
utils/data/combine_data.sh: combined utt2uniq
utils/data/combine_data.sh [info]: not combining segments as it does not exist
utils/data/combine_data.sh: combined utt2spk
utils/data/combine_data.sh [info]: not combining utt2lang as it does not exist
utils/data/combine_data.sh: combined utt2dur
utils/data/combine_data.sh: combined reco2dur
utils/data/combine_data.sh [info]: **not combining feats.scp as it does not exist everywhere**
utils/data/combine_data.sh: combined text
utils/data/combine_data.sh [info]: **not combining cmvn.scp as it does not exist everywhere**
utils/data/combine_data.sh [info]: not combining vad.scp as it does not exist
utils/data/combine_data.sh [info]: not combining reco2file_and_channel as it does not exist
utils/data/combine_data.sh: combined wav.scp
utils/data/combine_data.sh [info]: not combining spk2gender as it does not exist
fix_data_dir.sh: kept all 360294 utterances.
fix_data_dir.sh: old files are kept in data/train_sp/.backup
utils/data/perturb_data_dir_speed_3way.sh: generated 3-way speed-perturbed version of data in data/train, in data/train_sp
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp
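The 360294 figure above is the original training set tripled: perturb_data_dir_speed_3way.sh keeps the 1.0x copy and adds 0.9x and 1.1x versions, and a speed factor s rescales each utterance's duration by 1/s. A sketch of the bookkeeping (the 4.5 s utterance is a made-up example; 120098 is the AIShell v1 train count, consistent with 360294 / 3):

```python
# 3-way speed perturbation: each utterance appears at speeds 0.9, 1.0, 1.1.
orig_utts = 120098            # AIShell v1 training utterances
speeds = [0.9, 1.0, 1.1]
total_utts = orig_utts * len(speeds)
print(total_utts)             # 360294, matching "kept all 360294 utterances"

# A speed factor s rescales duration by 1/s (s > 1 plays faster, so shorter).
dur = 4.5                     # hypothetical 4.5 s utterance
perturbed = {s: round(dur / s, 2) for s in speeds}
print(perturbed)              # {0.9: 5.0, 1.0: 4.5, 1.1: 4.09}
```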
local/nnet3/run_ivector_common.sh: making MFCC features for low-resolution speed-perturbed data
steps/make_mfcc_pitch.sh --cmd run.pl --mem 8G --nj 70 data/train_sp exp/make_mfcc/train_sp mfcc_perturbed
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp
steps/make_mfcc_pitch.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC & Pitch features for train_sp
steps/compute_cmvn_stats.sh data/train_sp exp/make_mfcc/train_sp mfcc_perturbed
Succeeded creating CMVN stats for train_sp
fix_data_dir.sh: kept all 360294 utterances.
fix_data_dir.sh: old files are kept in data/train_sp/.backup
local/nnet3/run_ivector_common.sh: aligning with the perturbed low-resolution data
steps/align_fmllr.sh --nj 30 --cmd run.pl --mem 8G data/train_sp data/lang exp/tri5a exp/tri5a_sp_ali
steps/align_fmllr.sh: feature type is lda
steps/align_fmllr.sh: compiling training graphs
steps/align_fmllr.sh: aligning data in data/train_sp using exp/tri5a/final.alimdl and speaker-independent features.
steps/align_fmllr.sh: computing fMLLR transforms
steps/align_fmllr.sh: doing final alignment.
steps/align_fmllr.sh: done aligning data.
steps/diagnostic/analyze_alignments.sh --cmd run.pl --mem 8G data/lang exp/tri5a_sp_ali
steps/diagnostic/analyze_alignments.sh: see stats in exp/tri5a_sp_ali/log/analyze_alignments.log
404 warnings in exp/tri5a_sp_ali/log/align_pass2.*.log
2 warnings in exp/tri5a_sp_ali/log/fmllr.*.log
387 warnings in exp/tri5a_sp_ali/log/align_pass1.*.log
local/nnet3/run_ivector_common.sh: creating high-resolution MFCC features
utils/copy_data_dir.sh: copied data from data/train_sp to data/train_sp_hires
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp_hires
utils/copy_data_dir.sh: copied data from data/dev to data/dev_hires
utils/validate_data_dir.sh: Successfully validated data-directory data/dev_hires
utils/copy_data_dir.sh: copied data from data/test to data/test_hires
utils/validate_data_dir.sh: Successfully validated data-directory data/test_hires
utils/data/perturb_data_dir_volume.sh: data/train_sp_hires/feats.scp exists; moving it to data/train_sp_hires/.backup/ as it wouldn't be valid any more.
utils/data/perturb_data_dir_volume.sh: added volume perturbation to the data in data/train_sp_hires
steps/make_mfcc_pitch.sh --nj 10 --mfcc-config conf/mfcc_hires.conf --cmd run.pl --mem 8G data/train_sp_hires exp/make_hires/train_sp mfcc_perturbed_hires
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp_hires
steps/make_mfcc_pitch.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC & Pitch features for train_sp_hires
steps/compute_cmvn_stats.sh data/train_sp_hires exp/make_hires/train_sp mfcc_perturbed_hires
Succeeded creating CMVN stats for train_sp_hires
fix_data_dir.sh: kept all 360294 utterances.
fix_data_dir.sh: old files are kept in data/train_sp_hires/.backup
utils/copy_data_dir.sh: copied data from data/train_sp_hires to data/train_sp_hires_nopitch
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp_hires_nopitch
utils/data/limit_feature_dim.sh: warning: removing data/train_sp_hires_nopitch/cmvn.cp, you will have to regenerate it from the features.
utils/validate_data_dir.sh: Successfully validated data-directory data/train_sp_hires_nopitch
steps/compute_cmvn_stats.sh data/train_sp_hires_nopitch exp/make_hires/train_sp mfcc_perturbed_hires
Succeeded creating CMVN stats for train_sp_hires_nopitch
steps/make_mfcc_pitch.sh --nj 10 --mfcc-config conf/mfcc_hires.conf --cmd run.pl --mem 8G data/dev_hires exp/make_hires/dev mfcc_perturbed_hires
steps/make_mfcc_pitch.sh: moving data/dev_hires/feats.scp to data/dev_hires/.backup
utils/validate_data_dir.sh: Successfully validated data-directory data/dev_hires
steps/make_mfcc_pitch.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC & Pitch features for dev_hires
steps/compute_cmvn_stats.sh data/dev_hires exp/make_hires/dev mfcc_perturbed_hires
Succeeded creating CMVN stats for dev_hires
fix_data_dir.sh: kept all 14326 utterances.
fix_data_dir.sh: old files are kept in data/dev_hires/.backup
utils/copy_data_dir.sh: copied data from data/dev_hires to data/dev_hires_nopitch
utils/validate_data_dir.sh: Successfully validated data-directory data/dev_hires_nopitch
utils/data/limit_feature_dim.sh: warning: removing data/dev_hires_nopitch/cmvn.cp, you will have to regenerate it from the features.
utils/validate_data_dir.sh: Successfully validated data-directory data/dev_hires_nopitch
steps/compute_cmvn_stats.sh data/dev_hires_nopitch exp/make_hires/dev mfcc_perturbed_hires
Succeeded creating CMVN stats for dev_hires_nopitch
steps/make_mfcc_pitch.sh --nj 10 --mfcc-config conf/mfcc_hires.conf --cmd run.pl --mem 8G data/test_hires exp/make_hires/test mfcc_perturbed_hires
steps/make_mfcc_pitch.sh: moving data/test_hires/feats.scp to data/test_hires/.backup
utils/validate_data_dir.sh: Successfully validated data-directory data/test_hires
steps/make_mfcc_pitch.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC & Pitch features for test_hires
steps/compute_cmvn_stats.sh data/test_hires exp/make_hires/test mfcc_perturbed_hires
Succeeded creating CMVN stats for test_hires
fix_data_dir.sh: kept all 7176 utterances.
fix_data_dir.sh: old files are kept in data/test_hires/.backup
utils/copy_data_dir.sh: copied data from data/test_hires to data/test_hires_nopitch
utils/validate_data_dir.sh: Successfully validated data-directory data/test_hires_nopitch
utils/data/limit_feature_dim.sh: warning: removing data/test_hires_nopitch/cmvn.cp, you will have to regenerate it from the features.
utils/validate_data_dir.sh: Successfully validated data-directory data/test_hires_nopitch
steps/compute_cmvn_stats.sh data/test_hires_nopitch exp/make_hires/test mfcc_perturbed_hires
Succeeded creating CMVN stats for test_hires_nopitch
local/nnet3/run_ivector_common.sh: computing a subset of data to train the diagonal UBM.
utils/data/subset_data_dir.sh: reducing #utt from 360294 to 90073
local/nnet3/run_ivector_common.sh: computing a PCA transform from the hires data.
steps/online/nnet2/get_pca_transform.sh --cmd run.pl --mem 8G --splice-opts --left-context=3 --right-context=3 --max-utts 10000 --subsample 2 exp/nnet3/diag_ubm/train_sp_hires_nopitch_subset exp/nnet3/pca_transform
Done estimating PCA transform in exp/nnet3/pca_transform
local/nnet3/run_ivector_common.sh: training the diagonal UBM.
steps/online/nnet2/train_diag_ubm.sh --cmd run.pl --mem 8G --nj 30 --num-frames 700000 --num-threads 8 exp/nnet3/diag_ubm/train_sp_hires_nopitch_subset 512 exp/nnet3/pca_transform exp/nnet3/diag_ubm
steps/online/nnet2/train_diag_ubm.sh: Directory exp/nnet3/diag_ubm already exists. Backing up diagonal UBM in exp/nnet3/diag_ubm/backup.wLX
steps/online/nnet2/train_diag_ubm.sh: initializing model from E-M in memory, 
steps/online/nnet2/train_diag_ubm.sh: starting from 256 Gaussians, reaching 512;
steps/online/nnet2/train_diag_ubm.sh: for 20 iterations, using at most 700000 frames of data
Getting Gaussian-selection info
steps/online/nnet2/train_diag_ubm.sh: will train for 4 iterations, in parallel over
steps/online/nnet2/train_diag_ubm.sh: 30 machines, parallelized with 'run.pl --mem 8G'
steps/online/nnet2/train_diag_ubm.sh: Training pass 0
steps/online/nnet2/train_diag_ubm.sh: Training pass 1
steps/online/nnet2/train_diag_ubm.sh: Training pass 2
steps/online/nnet2/train_diag_ubm.sh: Training pass 3
local/nnet3/run_ivector_common.sh: training the iVector extractor
steps/online/nnet2/train_ivector_extractor.sh --cmd run.pl --mem 8G --nj 10 data/train_sp_hires_nopitch exp/nnet3/diag_ubm exp/nnet3/extractor
steps/online/nnet2/train_ivector_extractor.sh: Directory exp/nnet3/extractor already exists. Backing up iVector extractor in exp/nnet3/extractor/backup.FP5
steps/online/nnet2/train_ivector_extractor.sh: doing Gaussian selection and posterior computation
Accumulating stats (pass 0)
Summing accs (pass 0)
Updating model (pass 0)
Accumulating stats (pass 1)
Summing accs (pass 1)
Updating model (pass 1)
Accumulating stats (pass 2)
Summing accs (pass 2)
Updating model (pass 2)
Accumulating stats (pass 3)
Summing accs (pass 3)
Updating model (pass 3)
Accumulating stats (pass 4)
Summing accs (pass 4)
Updating model (pass 4)
Accumulating stats (pass 5)
Summing accs (pass 5)
Updating model (pass 5)
Accumulating stats (pass 6)
Summing accs (pass 6)
Updating model (pass 6)
Accumulating stats (pass 7)
Summing accs (pass 7)
Updating model (pass 7)
Accumulating stats (pass 8)
Summing accs (pass 8)
Updating model (pass 8)
Accumulating stats (pass 9)
Summing accs (pass 9)
Updating model (pass 9)
utils/data/modify_speaker_info.sh: copied data from data/train_sp_hires_nopitch to exp/nnet3/ivectors_train_sp/train_sp_sp_hires_nopitch_max2, number of speakers changed from 1020 to 180399
utils/validate_data_dir.sh: Successfully validated data-directory exp/nnet3/ivectors_train_sp/train_sp_sp_hires_nopitch_max2
steps/online/nnet2/extract_ivectors_online.sh --cmd run.pl --mem 8G --nj 30 exp/nnet3/ivectors_train_sp/train_sp_sp_hires_nopitch_max2 exp/nnet3/extractor exp/nnet3/ivectors_train_sp
steps/online/nnet2/extract_ivectors_online.sh: extracting iVectors
steps/online/nnet2/extract_ivectors_online.sh: combining iVectors across jobs
steps/online/nnet2/extract_ivectors_online.sh: done extracting (online) iVectors to exp/nnet3/ivectors_train_sp using the extractor in exp/nnet3/extractor.
steps/online/nnet2/extract_ivectors_online.sh --cmd run.pl --mem 8G --nj 8 data/dev_hires_nopitch exp/nnet3/extractor exp/nnet3/ivectors_dev
steps/online/nnet2/extract_ivectors_online.sh: extracting iVectors
steps/online/nnet2/extract_ivectors_online.sh: combining iVectors across jobs
steps/online/nnet2/extract_ivectors_online.sh: done extracting (online) iVectors to exp/nnet3/ivectors_dev using the extractor in exp/nnet3/extractor.
steps/online/nnet2/extract_ivectors_online.sh --cmd run.pl --mem 8G --nj 8 data/test_hires_nopitch exp/nnet3/extractor exp/nnet3/ivectors_test
steps/online/nnet2/extract_ivectors_online.sh: extracting iVectors
steps/online/nnet2/extract_ivectors_online.sh: combining iVectors across jobs
steps/online/nnet2/extract_ivectors_online.sh: done extracting (online) iVectors to exp/nnet3/ivectors_test using the extractor in exp/nnet3/extractor.
steps/align_fmllr_lats.sh --nj 30 --cmd run.pl --mem 8G data/train_sp data/lang exp/tri5a exp/tri5a_sp_lats
steps/align_fmllr_lats.sh: feature type is lda
steps/align_fmllr_lats.sh: compiling training graphs
steps/align_fmllr_lats.sh: aligning data in data/train_sp using exp/tri5a/final.alimdl and speaker-independent features.
steps/align_fmllr_lats.sh: computing fMLLR transforms
steps/align_fmllr_lats.sh: generating lattices containing alternate pronunciations.
steps/align_fmllr_lats.sh: done generating lattices from training transcripts.
1 warnings in exp/tri5a_sp_lats/log/generate_lattices.*.log
2 warnings in exp/tri5a_sp_lats/log/fmllr.*.log
399 warnings in exp/tri5a_sp_lats/log/align_pass1.*.log
steps/nnet3/chain/build_tree.sh --frame-subsampling-factor 3 --context-opts --context-width=2 --central-position=1 --cmd run.pl --mem 8G 5000 data/train_sp data/lang_chain exp/tri5a_sp_ali exp/chain/tri6_7d_tree_sp
steps/nnet3/chain/build_tree.sh: feature type is lda
steps/nnet3/chain/build_tree.sh: Using transforms from exp/tri5a_sp_ali
steps/nnet3/chain/build_tree.sh: Initializing monophone model (for alignment conversion, in case topology changed)
steps/nnet3/chain/build_tree.sh: Accumulating tree stats
steps/nnet3/chain/build_tree.sh: Getting questions for tree clustering.
steps/nnet3/chain/build_tree.sh: Building the tree
steps/nnet3/chain/build_tree.sh: Initializing the model
steps/nnet3/chain/build_tree.sh: Converting alignments from exp/tri5a_sp_ali to use current tree
steps/nnet3/chain/build_tree.sh: Done building tree
local/chain/run_tdnn.sh: creating neural net configs using the xconfig parser
tree-info exp/chain/tri6_7d_tree_sp/tree 
steps/nnet3/xconfig_to_configs.py --xconfig-file exp/chain/tdnn_1a_sp/configs/network.xconfig --config-dir exp/chain/tdnn_1a_sp/configs/
nnet3-init exp/chain/tdnn_1a_sp/configs//init.config exp/chain/tdnn_1a_sp/configs//init.raw 
LOG (nnet3-init[5.5.164~1-9698]:main():nnet3-init.cc:80) Initialized raw neural net and wrote it to exp/chain/tdnn_1a_sp/configs//init.raw
nnet3-info exp/chain/tdnn_1a_sp/configs//init.raw 
nnet3-init exp/chain/tdnn_1a_sp/configs//ref.config exp/chain/tdnn_1a_sp/configs//ref.raw 
LOG (nnet3-init[5.5.164~1-9698]:main():nnet3-init.cc:80) Initialized raw neural net and wrote it to exp/chain/tdnn_1a_sp/configs//ref.raw
nnet3-info exp/chain/tdnn_1a_sp/configs//ref.raw 
nnet3-init exp/chain/tdnn_1a_sp/configs//ref.config exp/chain/tdnn_1a_sp/configs//ref.raw 
LOG (nnet3-init[5.5.164~1-9698]:main():nnet3-init.cc:80) Initialized raw neural net and wrote it to exp/chain/tdnn_1a_sp/configs//ref.raw
nnet3-info exp/chain/tdnn_1a_sp/configs//ref.raw 
2019-01-16 20:02:01,589 [steps/nnet3/chain/train.py:35 - <module> - INFO ] Starting chain model trainer (train.py)
steps/nnet3/chain/train.py --stage -10 --cmd run.pl --mem 8G --feat.online-ivector-dir exp/nnet3/ivectors_train_sp --feat.cmvn-opts --norm-means=false --norm-vars=false --chain.xent-regularize 0.1 --chain.leaky-hmm-coefficient 0.1 --chain.l2-regularize 0.00005 --chain.apply-deriv-weights false --chain.lm-opts=--num-extra-lm-states=2000 --egs.dir  --egs.stage -10 --egs.opts --frames-overlap-per-eg 0 --egs.chunk-width 150,110,90 --trainer.num-chunk-per-minibatch 128 --trainer.frames-per-iter 1500000 --trainer.num-epochs 2 --trainer.optimization.num-jobs-initial 1 --trainer.optimization.num-jobs-final 1 --trainer.optimization.initial-effective-lrate 0.001 --trainer.optimization.final-effective-lrate 0.0001 --trainer.max-param-change 2.0 --cleanup.remove-egs true --feat-dir data/train_sp_hires --tree-dir exp/chain/tri6_7d_tree_sp --lat-dir exp/tri5a_sp_lats --dir exp/chain/tdnn_1a_sp
['steps/nnet3/chain/train.py', '--stage', '-10', '--cmd', 'run.pl --mem 8G', '--feat.online-ivector-dir', 'exp/nnet3/ivectors_train_sp', '--feat.cmvn-opts', '--norm-means=false --norm-vars=false', '--chain.xent-regularize', '0.1', '--chain.leaky-hmm-coefficient', '0.1', '--chain.l2-regularize', '0.00005', '--chain.apply-deriv-weights', 'false', '--chain.lm-opts=--num-extra-lm-states=2000', '--egs.dir', '', '--egs.stage', '-10', '--egs.opts', '--frames-overlap-per-eg 0', '--egs.chunk-width', '150,110,90', '--trainer.num-chunk-per-minibatch', '128', '--trainer.frames-per-iter', '1500000', '--trainer.num-epochs', '2', '--trainer.optimization.num-jobs-initial', '1', '--trainer.optimization.num-jobs-final', '1', '--trainer.optimization.initial-effective-lrate', '0.001', '--trainer.optimization.final-effective-lrate', '0.0001', '--trainer.max-param-change', '2.0', '--cleanup.remove-egs', 'true', '--feat-dir', 'data/train_sp_hires', '--tree-dir', 'exp/chain/tri6_7d_tree_sp', '--lat-dir', 'exp/tri5a_sp_lats', '--dir', 'exp/chain/tdnn_1a_sp']
2019-01-16 20:02:01,649 [steps/nnet3/chain/train.py:273 - train - INFO ] Arguments for the experiment
{'alignment_subsampling_factor': 3,
 'apply_deriv_weights': False,
 'backstitch_training_interval': 1,
 'backstitch_training_scale': 0.0,
 'chunk_left_context': 0,
 'chunk_left_context_initial': -1,
 'chunk_right_context': 0,
 'chunk_right_context_final': -1,
 'chunk_width': '150,110,90',
 'cleanup': True,
 'cmvn_opts': '--norm-means=false --norm-vars=false',
 'combine_sum_to_one_penalty': 0.0,
 'command': 'run.pl --mem 8G',
 'compute_per_dim_accuracy': False,
 'deriv_truncate_margin': None,
 'dir': 'exp/chain/tdnn_1a_sp',
 'do_final_combination': True,
 'dropout_schedule': None,
 'egs_command': None,
 'egs_dir': None,
 'egs_opts': '--frames-overlap-per-eg 0',
 'egs_stage': -10,
 'email': None,
 'exit_stage': None,
 'feat_dir': 'data/train_sp_hires',
 'final_effective_lrate': 0.0001,
 'frame_subsampling_factor': 3,
 'frames_per_iter': 1500000,
 'initial_effective_lrate': 0.001,
 'input_model': None,
 'l2_regularize': 5e-05,
 'lat_dir': 'exp/tri5a_sp_lats',
 'leaky_hmm_coefficient': 0.1,
 'left_deriv_truncate': None,
 'left_tolerance': 5,
 'lm_opts': '--num-extra-lm-states=2000',
 'max_lda_jobs': 10,
 'max_models_combine': 20,
 'max_objective_evaluations': 30,
 'max_param_change': 2.0,
 'momentum': 0.0,
 'num_chunk_per_minibatch': '128',
 'num_epochs': 2.0,
 'num_jobs_final': 1,
 'num_jobs_initial': 1,
 'online_ivector_dir': 'exp/nnet3/ivectors_train_sp',
 'preserve_model_interval': 100,
 'presoftmax_prior_scale_power': -0.25,
 'proportional_shrink': 0.0,
 'rand_prune': 4.0,
 'remove_egs': True,
 'reporting_interval': 0.1,
 'right_tolerance': 5,
 'samples_per_iter': 400000,
 'shrink_saturation_threshold': 0.4,
 'shrink_value': 1.0,
 'shuffle_buffer_size': 5000,
 'srand': 0,
 'stage': -10,
 'train_opts': [],
 'tree_dir': 'exp/chain/tri6_7d_tree_sp',
 'use_gpu': 'yes',
 'xent_regularize': 0.1}
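The dotted flag names on the command line (e.g. --trainer.optimization.num-jobs-initial) correspond to the underscored keys in the dump above (num_jobs_initial): train.py drops the section prefixes and turns hyphens into underscores. A rough sketch of that mapping; the helper here is ours, not code from Kaldi:

```python
# Illustrative re-creation of how train.py's dotted CLI flags map to the
# config-dict keys printed above (hypothetical helper, not Kaldi source).
def flag_to_key(flag):
    name = flag.lstrip('-')
    name = name.split('.')[-1]          # drop prefixes like "trainer.optimization."
    return name.replace('-', '_')       # hyphens become underscores

print(flag_to_key('--trainer.optimization.num-jobs-initial'))  # num_jobs_initial
print(flag_to_key('--chain.xent-regularize'))                  # xent_regularize
print(flag_to_key('--egs.chunk-width'))                        # chunk_width
```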
2019-01-16 20:02:07,967 [steps/nnet3/chain/train.py:327 - train - INFO ] Creating phone language-model
2019-01-16 20:02:14,455 [steps/nnet3/chain/train.py:332 - train - INFO ] Creating denominator FST
copy-transition-model exp/chain/tri6_7d_tree_sp/final.mdl exp/chain/tdnn_1a_sp/0.trans_mdl 
LOG (copy-transition-model[5.5.164~1-9698]:main():copy-transition-model.cc:62) Copied transition model.
2019-01-16 20:02:15,517 [steps/nnet3/chain/train.py:339 - train - INFO ] Initializing a basic network for estimating preconditioning matrix
2019-01-16 20:02:15,553 [steps/nnet3/chain/train.py:361 - train - INFO ] Generating egs
steps/nnet3/chain/get_egs.sh --frames-overlap-per-eg 0 --cmd run.pl --mem 8G --cmvn-opts --norm-means=false --norm-vars=false --online-ivector-dir exp/nnet3/ivectors_train_sp --left-context 13 --right-context 13 --left-context-initial -1 --right-context-final -1 --left-tolerance 5 --right-tolerance 5 --frame-subsampling-factor 3 --alignment-subsampling-factor 3 --stage -10 --frames-per-iter 1500000 --frames-per-eg 150,110,90 --srand 0 data/train_sp_hires exp/chain/tdnn_1a_sp exp/tri5a_sp_lats exp/chain/tdnn_1a_sp/egs
File data/train_sp_hires/utt2uniq exists, so augmenting valid_uttlist to
include all perturbed versions of the same 'real' utterances.
steps/nnet3/chain/get_egs.sh: creating egs.  To ensure they are not deleted later you can do:  touch exp/chain/tdnn_1a_sp/egs/.nodelete
steps/nnet3/chain/get_egs.sh: feature type is raw
tree-info exp/chain/tdnn_1a_sp/tree 
feat-to-dim scp:exp/nnet3/ivectors_train_sp/ivector_online.scp - 
steps/nnet3/chain/get_egs.sh: working out number of frames of training data
steps/nnet3/chain/get_egs.sh: working out feature dim
steps/nnet3/chain/get_egs.sh: creating 110 archives, each with 16567 egs, with
steps/nnet3/chain/get_egs.sh:   150,110,90 labels per example, and (left,right) context = (13,13)
steps/nnet3/chain/get_egs.sh: Getting validation and training subset examples in background.
steps/nnet3/chain/get_egs.sh: Generating training examples on disk
... Getting subsets of validation examples for diagnostics and combination.
steps/nnet3/chain/get_egs.sh: recombining and shuffling order of archives on disk
steps/nnet3/chain/get_egs.sh: removing temporary archives
steps/nnet3/chain/get_egs.sh: removing temporary alignments, lattices and transforms
steps/nnet3/chain/get_egs.sh: Finished preparing training examples
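The archive sizing above is frame bookkeeping: get_egs.sh targets --frames-per-iter 1500000 frames per archive, so 110 archives implies roughly 165M training frames, and 16567 egs per archive implies an average chunk of about 90 frames, i.e. most examples land on the shortest of the 150,110,90 widths. A back-of-the-envelope check, assuming the usual 10 ms frame shift:

```python
# Back-of-the-envelope check of get_egs.sh's archive numbers (estimates only).
frames_per_iter = 1_500_000   # --frames-per-iter
num_archives = 110
egs_per_archive = 16_567
num_utts = 360_294            # speed-perturbed training set

total_frames = frames_per_iter * num_archives        # ~165M frames
frames_per_utt = total_frames / num_utts             # ~458 frames per utterance
print(round(frames_per_utt * 0.01, 2), "s/utt")      # ~4.58 s at a 10 ms frame shift

avg_chunk = frames_per_iter / egs_per_archive
print(round(avg_chunk, 1), "frames/eg")              # ~90.5, near the 90-frame width
```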

Part 15: Iterations

2019-01-16 20:15:29,645 [steps/nnet3/chain/train.py:410 - train - INFO ] Copying the properties from exp/chain/tdnn_1a_sp/egs to exp/chain/tdnn_1a_sp
2019-01-16 20:15:29,671 [steps/nnet3/chain/train.py:424 - train - INFO ] Computing the preconditioning matrix for input features
2019-01-16 20:16:04,298 [steps/nnet3/chain/train.py:433 - train - INFO ] Preparing the initial acoustic model.
2019-01-16 20:16:05,196 [steps/nnet3/chain/train.py:467 - train - INFO ] Training will run for 2.0 epochs = 660 iterations
2019-01-16 20:16:05,196 [steps/nnet3/chain/train.py:509 - train - INFO ] Iter: 0/659    Epoch: 0.00/2.0 (0.0% complete)    lr: 0.001000    
2019-01-16 20:16:34,711 [steps/nnet3/chain/train.py:509 - train - INFO ] Iter: 1/659    Epoch: 0.00/2.0 (0.2% complete)    lr: 0.000997    
2019-01-16 20:16:57,582 [steps/nnet3/chain/train.py:509 - train - INFO ] Iter: 2/659    Epoch: 0.01/2.0 (0.3% complete)    lr: 0.000993    
(Roughly 280,000 characters of per-iteration log lines omitted here .... fill them in with your imagination ...)
2019-01-17 00:29:47,901 [steps/nnet3/chain/train.py:509 - train - INFO ] Iter: 658/659    Epoch: 1.99/2.0 (99.7% complete)    lr: 0.000101    
2019-01-17 00:30:11,185 [steps/nnet3/chain/train.py:509 - train - INFO ] Iter: 659/659    Epoch: 2.00/2.0 (99.8% complete)    lr: 0.000100    
2019-01-17 00:30:34,175 [steps/nnet3/chain/train.py:565 - train - INFO ] Doing final combination to produce final.mdl
2019-01-17 00:30:34,175 [steps/libs/nnet3/train/chain_objf/acoustic_model.py:571 - combine_models - INFO ] Combining set([519, 527, 660, 535, 495, 543, 551, 647, 559, 567, 575, 583, 503, 591, 599, 655, 607, 615, 623, 631, 511, 639]) models.
2019-01-17 00:30:49,737 [steps/nnet3/chain/train.py:594 - train - INFO ] Cleaning up the experiment directory exp/chain/tdnn_1a_sp
steps/nnet2/remove_egs.sh: Finished deleting examples in exp/chain/tdnn_1a_sp/egs
exp/chain/tdnn_1a_sp: num-iters=660 nj=1..1 num-params=12.2M dim=43+100->4320 combine=-0.054->-0.054 (over 4) xent:train/valid[438,659]=(-0.897,-0.855/-1.06,-1.03) logprob:train/valid[438,659]=(-0.053,-0.049/-0.072,-0.071)
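The 660-iteration count and the decaying learning rates in the log follow directly from the options: train.py processes num_epochs × num_archives × frame_subsampling_factor archive passes (2.0 × 110 × 3 = 660) over an average of 1 job, and decays the effective learning rate exponentially from 0.001 to 0.0001. A sketch that reproduces the logged values (simplified to a constant job count, which holds here since num-jobs-initial = num-jobs-final = 1):

```python
# Reproduce the iteration count and learning-rate schedule seen in the log.
num_epochs = 2.0
num_archives = 110            # from get_egs.sh
frame_subsampling_factor = 3  # chain training revisits each archive 3 times
num_jobs = 1                  # num-jobs-initial = num-jobs-final = 1

num_iters = int(num_epochs * num_archives * frame_subsampling_factor / num_jobs)
print(num_iters)  # 660, matching "Training will run for 2.0 epochs = 660 iterations"

initial_lr, final_lr = 0.001, 0.0001
def lr(i):
    # Exponential decay from the initial to the final effective learning rate.
    return initial_lr * (final_lr / initial_lr) ** (i / num_iters)

for i in (0, 1, 2, 658, 659):
    print(i, f"{lr(i):.6f}")  # 0.001000, 0.000997, 0.000993, 0.000101, 0.000100
```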
tree-info exp/chain/tdnn_1a_sp/tree 
tree-info exp/chain/tdnn_1a_sp/tree 
fstcomposecontext --context-size=2 --central-position=1 --read-disambig-syms=data/lang_test/phones/disambig.int --write-disambig-syms=data/lang_test/tmp/disambig_ilabels_2_1.int data/lang_test/tmp/ilabels_2_1.4603 data/lang_test/tmp/LG.fst 
fstisstochastic data/lang_test/tmp/CLG_2_1.fst 
-0.0663446 -0.0666824
[info]: CLG not stochastic.
make-h-transducer --disambig-syms-out=exp/chain/tdnn_1a_sp/graph/disambig_tid.int --transition-scale=1.0 data/lang_test/tmp/ilabels_2_1 exp/chain/tdnn_1a_sp/tree exp/chain/tdnn_1a_sp/final.mdl 
fsttablecompose exp/chain/tdnn_1a_sp/graph/Ha.fst data/lang_test/tmp/CLG_2_1.fst 
fstdeterminizestar --use-log=true 
fstrmsymbols exp/chain/tdnn_1a_sp/graph/disambig_tid.int 
fstrmepslocal 
fstminimizeencoded 
fstisstochastic exp/chain/tdnn_1a_sp/graph/HCLGa.fst 
0.393711 -0.237036
HCLGa is not stochastic
add-self-loops --self-loop-scale=1.0 --reorder=true exp/chain/tdnn_1a_sp/final.mdl exp/chain/tdnn_1a_sp/graph/HCLGa.fst 
fstisstochastic exp/chain/tdnn_1a_sp/graph/HCLG.fst 
0.177603 -0.184153
[info]: final HCLG is not stochastic.
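The pairs of numbers printed by fstisstochastic are the minimum and maximum deviation, in log space, of any state's outgoing arc-weight sum from 1; a perfectly stochastic WFST would print values near 0, and graph building only warns when it is not. A tiny illustration of the per-state quantity on hand-made arc weights (a simplification; the real tool sweeps every state of the FST):

```python
import math

# Arc weights in -log probability space, as in Kaldi/OpenFst graphs.
# A state is stochastic if its outgoing arc probabilities sum to 1.
def log_deviation(neg_log_weights):
    total_prob = sum(math.exp(-w) for w in neg_log_weights)
    return -math.log(total_prob)        # 0.0 when the state is stochastic

two_halves = [math.log(2), math.log(2)]      # two arcs of prob 0.5 each
print(round(log_deviation(two_halves), 6))   # 0.0: stochastic

lopsided = [math.log(2), math.log(4)]        # probs 0.5 + 0.25 = 0.75
print(round(log_deviation(lopsided), 4))     # 0.2877: not stochastic
```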
steps/nnet3/decode.sh --acwt 1.0 --post-decode-acwt 10.0 --nj 10 --cmd run.pl --mem 8G --online-ivector-dir exp/nnet3/ivectors_dev exp/chain/tdnn_1a_sp/graph data/dev_hires exp/chain/tdnn_1a_sp/decode_dev
steps/nnet3/decode.sh: feature type is raw
steps/diagnostic/analyze_lats.sh --cmd run.pl --mem 8G --iter final exp/chain/tdnn_1a_sp/graph exp/chain/tdnn_1a_sp/decode_dev
steps/diagnostic/analyze_lats.sh: see stats in exp/chain/tdnn_1a_sp/decode_dev/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(1,4,26) and mean=12.2
steps/diagnostic/analyze_lats.sh: see stats in exp/chain/tdnn_1a_sp/decode_dev/log/analyze_lattice_depth_stats.log
score best paths
+ steps/score_kaldi.sh --cmd 'run.pl --mem 8G' data/dev_hires exp/chain/tdnn_1a_sp/graph exp/chain/tdnn_1a_sp/decode_dev
steps/score_kaldi.sh --cmd run.pl --mem 8G data/dev_hires exp/chain/tdnn_1a_sp/graph exp/chain/tdnn_1a_sp/decode_dev
steps/score_kaldi.sh: scoring with word insertion penalty=0.0,0.5,1.0
+ steps/scoring/score_kaldi_cer.sh --stage 2 --cmd 'run.pl --mem 8G' data/dev_hires exp/chain/tdnn_1a_sp/graph exp/chain/tdnn_1a_sp/decode_dev
steps/scoring/score_kaldi_cer.sh --stage 2 --cmd run.pl --mem 8G data/dev_hires exp/chain/tdnn_1a_sp/graph exp/chain/tdnn_1a_sp/decode_dev
steps/scoring/score_kaldi_cer.sh: scoring with word insertion penalty=0.0,0.5,1.0
+ echo 'local/score.sh: Done'
local/score.sh: Done
score confidence and timing with sclite
Decoding done.
steps/nnet3/decode.sh --acwt 1.0 --post-decode-acwt 10.0 --nj 10 --cmd run.pl --mem 8G --online-ivector-dir exp/nnet3/ivectors_test exp/chain/tdnn_1a_sp/graph data/test_hires exp/chain/tdnn_1a_sp/decode_test
steps/nnet3/decode.sh: feature type is raw
steps/diagnostic/analyze_lats.sh --cmd run.pl --mem 8G --iter final exp/chain/tdnn_1a_sp/graph exp/chain/tdnn_1a_sp/decode_test
steps/diagnostic/analyze_lats.sh: see stats in exp/chain/tdnn_1a_sp/decode_test/log/analyze_alignments.log
Overall, lattice depth (10,50,90-percentile)=(1,4,39) and mean=18.6
steps/diagnostic/analyze_lats.sh: see stats in exp/chain/tdnn_1a_sp/decode_test/log/analyze_lattice_depth_stats.log
score best paths
+ steps/score_kaldi.sh --cmd 'run.pl --mem 8G' data/test_hires exp/chain/tdnn_1a_sp/graph exp/chain/tdnn_1a_sp/decode_test
steps/score_kaldi.sh --cmd run.pl --mem 8G data/test_hires exp/chain/tdnn_1a_sp/graph exp/chain/tdnn_1a_sp/decode_test
steps/score_kaldi.sh: scoring with word insertion penalty=0.0,0.5,1.0
+ steps/scoring/score_kaldi_cer.sh --stage 2 --cmd 'run.pl --mem 8G' data/test_hires exp/chain/tdnn_1a_sp/graph exp/chain/tdnn_1a_sp/decode_test
steps/scoring/score_kaldi_cer.sh --stage 2 --cmd run.pl --mem 8G data/test_hires exp/chain/tdnn_1a_sp/graph exp/chain/tdnn_1a_sp/decode_test
steps/scoring/score_kaldi_cer.sh: scoring with word insertion penalty=0.0,0.5,1.0
+ echo 'local/score.sh: Done'
local/score.sh: Done
score confidence and timing with sclite
Decoding done.

Next: Kaldi Step by Step on AIShell v1 S5, Part 5: chain DNN
Next: Kaldi Step by Step on AIShell v1 S5, Part 4: nnet3 DNN
Back: Kaldi Step by Step on AIShell v1 S5, Part 3: Triphone
Back: Kaldi Step by Step on AIShell v1 S5, Part 2: Monophone
Back: Kaldi Step by Step on AIShell v1 S5, Part 1: Before MONO

Also see: Kaldi running TIMIT end to end, complete results (including DNN)
