A Clean Step-by-Step Run of AIShell v1 S5 in Kaldi, Part 1: Before Mono

  • Acknowledgements
  • Machine configuration
  • Detailed output of AIShell v1 under Kaldi
    • Part 1: Data preparation
    • Part 2: MFCC & CMVN
    • Part 3: Monophone

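Acknowledgements

Thanks to AIShell for its exploration on the road to commercialization. Looking forward to the arrival of v3.

Machine configuration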

sv@HP:~$ sudo lsb_release -a
Distributor ID:	Ubuntu
Description:	Ubuntu 18.04.1 LTS
Release:	18.04
Codename:	bionic

sv@HP:~$ cat /proc/cpuinfo | grep model\ name
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
model name	: Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz
sv@HP:~$ cat /proc/meminfo | grep MemTotal
MemTotal:       16321360 kB
sv@HP:~$ lspci | grep 'VGA'
01:00.0 VGA compatible controller: NVIDIA Corporation GP104 [GeForce GTX 1070] (rev a1)
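
The `model name` line appears 12 times above, once per logical CPU, and the feature-extraction commands later in this post pass `--nj 10`. As a minimal sketch, assuming you want to keep a requested job count within the number of logical CPUs on a smaller machine, something like the following could be used. `cap_nj` is a hypothetical helper for illustration and is not part of Kaldi.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kaldi): cap a requested --nj value
# at the number of online logical CPUs reported by getconf.
cap_nj() {
  local requested=$1 cpus
  # Fall back to 1 if getconf cannot report the CPU count.
  cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
  if [ "$requested" -gt "$cpus" ]; then
    echo "$cpus"
  else
    echo "$requested"
  fi
}
```

For example, `nj=$(cap_nj 10)` leaves the value at 10 on the 12-thread machine shown here and lowers it on a machine with fewer logical CPUs.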

Detailed output of AIShell v1 under Kaldi

Everything in one place.

Part 1: Data preparation

sv@HP:~/lkaldi/egs/aishell/s5$ data=/home/sv/lkaldi/egs/aishell/s5/dat
sv@HP:~/lkaldi/egs/aishell/s5$ . ./cmd.sh
sv@HP:~/lkaldi/egs/aishell/s5$ local/aishell_prepare_dict.sh $data/resource_aishell || exit 1;
local/aishell_prepare_dict.sh: AISHELL dict preparation succeeded
sv@HP:~/lkaldi/egs/aishell/s5$ 
sv@HP:~/lkaldi/egs/aishell/s5$ # Data Preparation,
sv@HP:~/lkaldi/egs/aishell/s5$ local/aishell_data_prep.sh $data/data_aishell/wav $data/data_aishell/transcript || exit 1;
Preparing data/local/train transcriptions
Preparing data/local/dev transcriptions
Preparing data/local/test transcriptions
local/aishell_data_prep.sh: AISHELL data preparation succeeded
sv@HP:~/lkaldi/egs/aishell/s5$ 
sv@HP:~/lkaldi/egs/aishell/s5$ # Phone Sets, questions, L compilation
sv@HP:~/lkaldi/egs/aishell/s5$ utils/prepare_lang.sh --position-dependent-phones false data/local/dict \
>     "<SPOKEN_NOISE>" data/local/lang data/lang || exit 1;
utils/prepare_lang.sh --position-dependent-phones false data/local/dict <SPOKEN_NOISE> data/local/lang data/lang
Checking data/local/dict/silence_phones.txt ...
--> reading data/local/dict/silence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/silence_phones.txt is OK

Checking data/local/dict/optional_silence.txt ...
--> reading data/local/dict/optional_silence.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/optional_silence.txt is OK

Checking data/local/dict/nonsilence_phones.txt ...
--> reading data/local/dict/nonsilence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/nonsilence_phones.txt is OK

Checking disjoint: silence_phones.txt, nonsilence_phones.txt
--> disjoint property is OK.

Checking data/local/dict/lexicon.txt
--> reading data/local/dict/lexicon.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/lexicon.txt is OK

Checking data/local/dict/extra_questions.txt ...
--> reading data/local/dict/extra_questions.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/extra_questions.txt is OK
--> SUCCESS [validating dictionary directory data/local/dict]

**Creating data/local/dict/lexiconp.txt from data/local/dict/lexicon.txt
fstaddselfloops data/lang/phones/wdisambig_phones.int data/lang/phones/wdisambig_words.int 
prepare_lang.sh: validating output directory
utils/validate_lang.pl data/lang
Checking data/lang/phones.txt ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/phones.txt is OK

Checking words.txt: #0 ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/lang/words.txt is OK

Checking disjoint: silence.txt, nonsilence.txt, disambig.txt ...
--> silence.txt and nonsilence.txt are disjoint
--> silence.txt and disambig.txt are disjoint
--> disambig.txt and nonsilence.txt are disjoint
--> disjoint property is OK

Checking sumation: silence.txt, nonsilence.txt, disambig.txt ...
--> found no unexplainable phones in phones.txt

Checking data/lang/phones/context_indep.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.int corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.csl corresponds to data/lang/phones/context_indep.txt
--> data/lang/phones/context_indep.{txt, int, csl} are OK

Checking data/lang/phones/nonsilence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 216 entry/entries in data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.int corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.csl corresponds to data/lang/phones/nonsilence.txt
--> data/lang/phones/nonsilence.{txt, int, csl} are OK

Checking data/lang/phones/silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/silence.txt
--> data/lang/phones/silence.int corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.csl corresponds to data/lang/phones/silence.txt
--> data/lang/phones/silence.{txt, int, csl} are OK

Checking data/lang/phones/optional_silence.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.int corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.csl corresponds to data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.{txt, int, csl} are OK

Checking data/lang/phones/disambig.{txt, int, csl} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 105 entry/entries in data/lang/phones/disambig.txt
--> data/lang/phones/disambig.int corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.csl corresponds to data/lang/phones/disambig.txt
--> data/lang/phones/disambig.{txt, int, csl} are OK

Checking data/lang/phones/roots.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 67 entry/entries in data/lang/phones/roots.txt
--> data/lang/phones/roots.int corresponds to data/lang/phones/roots.txt
--> data/lang/phones/roots.{txt, int} are OK

Checking data/lang/phones/sets.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 67 entry/entries in data/lang/phones/sets.txt
--> data/lang/phones/sets.int corresponds to data/lang/phones/sets.txt
--> data/lang/phones/sets.{txt, int} are OK

Checking data/lang/phones/extra_questions.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 7 entry/entries in data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.int corresponds to data/lang/phones/extra_questions.txt
--> data/lang/phones/extra_questions.{txt, int} are OK

Checking optional_silence.txt ...
--> reading data/lang/phones/optional_silence.txt
--> data/lang/phones/optional_silence.txt is OK

Checking disambiguation symbols: #0 and #1
--> data/lang/phones/disambig.txt has "#0" and "#1"
--> data/lang/phones/disambig.txt is OK

Checking topo ...

Checking word-level disambiguation symbols...
--> data/lang/phones/wdisambig.txt exists (newer prepare_lang.sh)
Checking data/lang/oov.{txt, int} ...
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> 1 entry/entries in data/lang/oov.txt
--> data/lang/oov.int corresponds to data/lang/oov.txt
--> data/lang/oov.{txt, int} are OK

--> data/lang/L.fst is olabel sorted
--> data/lang/L_disambig.fst is olabel sorted
--> SUCCESS [validating lang directory data/lang]
sv@HP:~/lkaldi/egs/aishell/s5$ 
sv@HP:~/lkaldi/egs/aishell/s5$ # LM training
sv@HP:~/lkaldi/egs/aishell/s5$ local/aishell_train_lms.sh || exit 1;
Getting raw N-gram counts
discount_ngrams: for n-gram order 1, D=0.000000, tau=0.000000 phi=1.000000
discount_ngrams: for n-gram order 2, D=0.000000, tau=0.000000 phi=1.000000
discount_ngrams: for n-gram order 3, D=1.000000, tau=0.000000 phi=1.000000
Iteration 1/6 of optimizing discounting parameters
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.675000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.800000, tau=0.675000 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=0.825000 phi=2.000000
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.900000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.800000, tau=0.900000 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.100000 phi=2.000000
discount_ngrams: for n-gram order 1, D=0.600000, tau=1.215000 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.800000, tau=1.215000 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.485000 phi=2.000000
interpolate_ngrams: 137074 words in wordslist
interpolate_ngrams: 137074 words in wordslist
interpolate_ngrams: 137074 words in wordslist
Perplexity over 99496.000000 words is 573.088187
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 573.088187
Perplexity over 99496.000000 words is 571.430399
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 571.430399

real	0m2.165s
user	0m2.870s
sys	0m0.100s

real	0m2.170s
user	0m2.861s
sys	0m0.064s
Perplexity over 99496.000000 words is 571.860357
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 571.860357

real	0m2.264s
user	0m2.922s
sys	0m0.047s
Projected perplexity change from setting alpha=-0.413521475380432 is 571.860357->571.350704659834, reduction of 0.509652340166213
Alpha value on iter 1 is -0.413521475380432
Iteration 2/6 of optimizing discounting parameters
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.800000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=0.483845 phi=2.000000
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.800000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=0.645126 phi=2.000000
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.800000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=0.870921 phi=2.000000
interpolate_ngrams: 137074 words in wordslist
interpolate_ngrams: 137074 words in wordslist
interpolate_ngrams: 137074 words in wordslist
Perplexity over 99496.000000 words is 570.909914
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 570.909914

real	0m2.152s
user	0m2.881s
sys	0m0.069s
Perplexity over 99496.000000 words is 570.548231
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 570.548231
Perplexity over 99496.000000 words is 570.209333
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 570.209333

real	0m2.166s
user	0m2.794s
sys	0m0.062s

real	0m2.168s
user	0m2.869s
sys	0m0.081s
optimize_alpha.pl: alpha=0.782133003937562 is too positive, limiting it to 0.7
Projected perplexity change from setting alpha=0.7 is 570.548231->570.0658029, reduction of 0.482428099999765
Alpha value on iter 2 is 0.7
Iteration 3/6 of optimizing discounting parameters
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.800000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.096715 phi=1.750000
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.800000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.096715 phi=2.000000
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.800000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.096715 phi=2.350000
interpolate_ngrams: 137074 words in wordslist
interpolate_ngrams: 137074 words in wordslist
interpolate_ngrams: 137074 words in wordslist
Perplexity over 99496.000000 words is 570.074175
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 570.074175

real	0m2.126s
user	0m2.789s
sys	0m0.121s
Perplexity over 99496.000000 words is 570.070852
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 570.070852

real	0m2.137s
user	0m2.750s
sys	0m0.065s
Perplexity over 99496.000000 words is 570.135232
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 570.135232

real	0m2.215s
user	0m2.898s
sys	0m0.081s
Projected perplexity change from setting alpha=-0.149743638839048 is 570.074175->570.068152268062, reduction of 0.00602273193794645
Alpha value on iter 3 is -0.149743638839048
Iteration 4/6 of optimizing discounting parameters
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.096715 phi=1.850256
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.800000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.096715 phi=1.850256
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=1.080000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.096715 phi=1.850256
interpolate_ngrams: 137074 words in wordslist
interpolate_ngrams: 137074 words in wordslist
interpolate_ngrams: 137074 words in wordslist
Perplexity over 99496.000000 words is 651.559076
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 651.559076

real	0m1.505s
user	0m1.853s
sys	0m0.075s
Perplexity over 99496.000000 words is 571.811721
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 571.811721
Perplexity over 99496.000000 words is 570.079098
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 570.079098

real	0m2.131s
user	0m2.738s
sys	0m0.097s

real	0m2.131s
user	0m2.754s
sys	0m0.091s
Projected perplexity change from setting alpha=-0.116327143544381 is 570.079098->564.672375993263, reduction of 5.40672200673657
Alpha value on iter 4 is -0.116327143544381
Iteration 5/6 of optimizing discounting parameters
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.706938, tau=0.395873 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.096715 phi=1.850256
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.706938, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.096715 phi=1.850256
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.706938, tau=0.712571 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.096715 phi=1.850256
interpolate_ngrams: 137074 words in wordslist
interpolate_ngrams: 137074 words in wordslist
interpolate_ngrams: 137074 words in wordslist
Perplexity over 99496.000000 words is 567.231151
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 567.231151

real	0m2.130s
user	0m2.838s
sys	0m0.076s
Perplexity over 99496.000000 words is 567.407206
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 567.407206

real	0m2.158s
user	0m2.814s
sys	0m0.060s
Perplexity over 99496.000000 words is 567.980179
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 567.980179

real	0m2.255s
user	0m2.983s
sys	0m0.058s
Projected perplexity change from setting alpha=0.259356959958262 is 567.407206->567.206654822021, reduction of 0.20055117797915
Alpha value on iter 5 is 0.259356959958262
Iteration 6/6 of optimizing discounting parameters
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.706938, tau=0.664727 phi=1.750000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.096715 phi=1.850256
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.706938, tau=0.664727 phi=2.000000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.096715 phi=1.850256
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.706938, tau=0.664727 phi=2.350000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.096715 phi=1.850256
interpolate_ngrams: 137074 words in wordslist
interpolate_ngrams: 137074 words in wordslist
interpolate_ngrams: 137074 words in wordslist
Perplexity over 99496.000000 words is 567.181130
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 567.181130

real	0m2.129s
user	0m2.812s
sys	0m0.080s
Perplexity over 99496.000000 words is 567.346876
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 567.346876

real	0m2.141s
user	0m2.747s
sys	0m0.114s
Perplexity over 99496.000000 words is 567.478625
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 567.478625

real	0m2.243s
user	0m2.949s
sys	0m0.076s
optimize_alpha.pl: alpha=2.83365708509299 is too positive, limiting it to 0.7
Projected perplexity change from setting alpha=0.7 is 567.346876->567.0372037, reduction of 0.309672299999761
Alpha value on iter 6 is 0.7
Final config is:
D=0.6 tau=0.527830672157611 phi=2
D=0.706938285164495 tau=0.664727230661135 phi=2.7
D=0 tau=1.09671484103859 phi=1.85025636116095
Discounting N-grams.
discount_ngrams: for n-gram order 1, D=0.600000, tau=0.527831 phi=2.000000
discount_ngrams: for n-gram order 2, D=0.706938, tau=0.664727 phi=2.700000
discount_ngrams: for n-gram order 3, D=0.000000, tau=1.096715 phi=1.850256
Computing final perplexity
Building ARPA LM (perplexity computation is in background)
interpolate_ngrams: 137074 words in wordslist
interpolate_ngrams: 137074 words in wordslist
Perplexity over 99496.000000 words is 567.320537
Perplexity over 99496.000000 words (excluding 0.000000 OOVs) is 567.320537
567.320537
Done training LM of type 3gram-mincount
sv@HP:~/lkaldi/egs/aishell/s5$ 
sv@HP:~/lkaldi/egs/aishell/s5$ # G compilation, check LG composition
sv@HP:~/lkaldi/egs/aishell/s5$ utils/format_lm.sh data/lang data/local/lm/3gram-mincount/lm_unpruned.gz \
>     data/local/dict/lexicon.txt data/lang_test || exit 1;
Converting 'data/local/lm/3gram-mincount/lm_unpruned.gz' to FST
arpa2fst --disambig-symbol=#0 --read-symbol-table=data/lang_test/words.txt - data/lang_test/G.fst 
LOG (arpa2fst[5.5.164~1-9698]:Read():arpa-file-parser.cc:94) Reading \data\ section.
LOG (arpa2fst[5.5.164~1-9698]:Read():arpa-file-parser.cc:149) Reading \1-grams: section.
LOG (arpa2fst[5.5.164~1-9698]:Read():arpa-file-parser.cc:149) Reading \2-grams: section.
LOG (arpa2fst[5.5.164~1-9698]:Read():arpa-file-parser.cc:149) Reading \3-grams: section.
LOG (arpa2fst[5.5.164~1-9698]:RemoveRedundantStates():arpa-lm-compiler.cc:359) Reduced num-states from 561655 to 102646
fstisstochastic data/lang_test/G.fst 
8.84583e-06 -0.56498
Succeeded in formatting LM: 'data/local/lm/3gram-mincount/lm_unpruned.gz'
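
The LM training output above is long, and the lines of interest are the per-iteration "Projected perplexity change" lines and the last reported perplexity. As a rough sketch, assuming the output of `local/aishell_train_lms.sh` was saved to a file (for example with `2>&1 | tee lm_train.log`, where the file name is an assumption), the following hypothetical helper condenses it. It relies only on the line formats visible in the transcript.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Kaldi or kaldi_lm): summarize a saved
# LM training log. It prints the projected perplexity reduction for each
# "Projected perplexity change" line, then the last perplexity reported.
lm_log_summary() {
  awk '
    /^Projected perplexity change/ { iter++; printf "iter %d reduction %s\n", iter, $NF }
    /^Perplexity over /            { last = $NF }
    END { if (last != "") printf "final perplexity %s\n", last }
  ' "$1"
}
```

Run as `lm_log_summary lm_train.log`. On the output shown above it would print one line for each of the six iterations, followed by the final perplexity line.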

Part 2: MFCC & CMVN

sv@HP:~/lkaldi/egs/aishell/s5$ mfccdir=mfcc
sv@HP:~/lkaldi/egs/aishell/s5$ # for x in train dev test; do
sv@HP:~/lkaldi/egs/aishell/s5$   steps/make_mfcc_pitch.sh --cmd "$train_cmd" --nj 2 data/train exp/make_mfcc/train $mfccdir;
steps/make_mfcc_pitch.sh --cmd run.pl --mem 8G --nj 2 data/train exp/make_mfcc/train mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/train
steps/make_mfcc_pitch.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC & Pitch features for train
sv@HP:~/lkaldi/egs/aishell/s5$   steps/compute_cmvn_stats.sh data/train exp/make_mfcc/train $mfccdir || exit 1;
steps/compute_cmvn_stats.sh data/train exp/make_mfcc/train mfcc
Succeeded creating CMVN stats for train
sv@HP:~/lkaldi/egs/aishell/s5$   utils/fix_data_dir.sh data/train || exit 1;
fix_data_dir.sh: kept all 120098 utterances.
fix_data_dir.sh: old files are kept in data/train/.backup
sv@HP:~/lkaldi/egs/aishell/s5$ 
sv@HP:~/lkaldi/egs/aishell/s5$   steps/make_mfcc_pitch.sh --cmd "$train_cmd" --nj 10 data/dev exp/make_mfcc/dev $mfccdir || exit 1;
steps/make_mfcc_pitch.sh --cmd run.pl --mem 8G --nj 10 data/dev exp/make_mfcc/dev mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/dev
steps/make_mfcc_pitch.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC & Pitch features for dev
sv@HP:~/lkaldi/egs/aishell/s5$   steps/compute_cmvn_stats.sh data/dev exp/make_mfcc/dev $mfccdir || exit 1;
steps/compute_cmvn_stats.sh data/dev exp/make_mfcc/dev mfcc
Succeeded creating CMVN stats for dev
sv@HP:~/lkaldi/egs/aishell/s5$   utils/fix_data_dir.sh data/dev || exit 1;
fix_data_dir.sh: kept all 14326 utterances.
fix_data_dir.sh: old files are kept in data/dev/.backup
sv@HP:~/lkaldi/egs/aishell/s5$ 
sv@HP:~/lkaldi/egs/aishell/s5$   steps/make_mfcc_pitch.sh --cmd "$train_cmd" --nj 10 data/test exp/make_mfcc/test $mfccdir || exit 1;
steps/make_mfcc_pitch.sh --cmd run.pl --mem 8G --nj 10 data/test exp/make_mfcc/test mfcc
utils/validate_data_dir.sh: Successfully validated data-directory data/test
steps/make_mfcc_pitch.sh: [info]: no segments file exists: assuming wav.scp indexed by utterance.
Succeeded creating MFCC & Pitch features for test
sv@HP:~/lkaldi/egs/aishell/s5$   steps/compute_cmvn_stats.sh data/test exp/make_mfcc/test $mfccdir || exit 1;
steps/compute_cmvn_stats.sh data/test exp/make_mfcc/test mfcc
Succeeded creating CMVN stats for test
sv@HP:~/lkaldi/egs/aishell/s5$   utils/fix_data_dir.sh data/test || exit 1;
fix_data_dir.sh: kept all 7176 utterances.
fix_data_dir.sh: old files are kept in data/test/.backup
sv@HP:~/lkaldi/egs/aishell/s5$ #done
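
The transcript above runs the same three commands by hand for train, dev and test, with the `for` loop commented out. A minimal sketch of folding them back into a loop is shown below. It assumes the working directory is `egs/aishell/s5` and that `cmd.sh` has been sourced so `$train_cmd` is set. `run` and `make_features` are hypothetical helper names introduced for illustration, and the `DRY_RUN` switch is likewise an assumption that lets the commands be printed without Kaldi installed.

```shell
#!/usr/bin/env bash
# Sketch: the per-set commands from the transcript wrapped in a function.
mfccdir=mfcc

# Hypothetical helper: print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper: MFCC + pitch features, CMVN stats, then a
# consistency fix of the data directory, for one data set.
#   $1 = data set name (train, dev or test), $2 = number of parallel jobs
make_features() {
  local x=$1 nj=$2
  run steps/make_mfcc_pitch.sh --cmd "$train_cmd" --nj "$nj" \
      "data/$x" "exp/make_mfcc/$x" "$mfccdir" || return 1
  run steps/compute_cmvn_stats.sh "data/$x" "exp/make_mfcc/$x" "$mfccdir" || return 1
  run utils/fix_data_dir.sh "data/$x" || return 1
}

# Usage (not executed here):
#   . ./cmd.sh
#   for x in train dev test; do make_features "$x" 10 || exit 1; done
```

Note that the run above used `--nj 2` for train and `--nj 10` for dev and test, so pass the job count per set if you want to reproduce it exactly. Setting `DRY_RUN=1` prints each command instead of running it, which is a quick way to check the arguments first.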

Part 3: Monophone

Continue: A Clean Step-by-Step Run of AIShell v1 S5 in Kaldi, Part 2: Monophone


Continue: A Clean Step-by-Step Run of AIShell v1 S5 in Kaldi, Part 3: Triphone
Continue: A Clean Step-by-Step Run of AIShell v1 S5 in Kaldi, Part 4: nnet3 DNN
Continue: A Clean Step-by-Step Run of AIShell v1 S5 in Kaldi, Part 5: chain DNN

Other reference: Complete Results of a Clean TIMIT Run in Kaldi (Including DNN)
