1. Get Tacotron-2-master.zip from https://github.com/Rayhane-mamah/Tacotron-2
2. Unzip Tacotron-2-master.zip on Ubuntu.
3. Terminal: cp -r training_data ./Tacotron-2  # training_data is the folder prepared from the LJSpeech-1.1 dataset
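(For reference, training_data is produced by the repo's preprocessing step; a hedged reminder, flag name per the repo README, verify against the local copy:)
Terminal: python preprocess.py --dataset='LJSpeech-1.1'  # writes the training_data/ folder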
4. Terminal: python train.py --model='Tacotron-2'. This fails with:
CancelledError (see above for traceback): Enqueue operation was cancelled
[[Node: datafeeder/eval_queue_enqueue = QueueEnqueueV2[Tcomponents=[DT_INT32, DT_INT32, DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_INT32, DT_INT32], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/device:CPU:0"](datafeeder/eval_queue, _arg_datafeeder/inputs_0_1, _arg_datafeeder/input_lengths_0_0, _arg_datafeeder/mel_targets_0_3, _arg_datafeeder/token_targets_0_6, _arg_datafeeder/linear_targets_0_2, _arg_datafeeder/targets_lengths_0_5, _arg_datafeeder/split_infos_0_4)]]
Traceback (most recent call last):
File "train.py", line 138, in
main()
File "train.py", line 132, in main
train(args, log_dir, hparams)
File "train.py", line 57, in train
raise('Error occured while training Tacotron, Exiting!')
TypeError: exceptions must derive from BaseException
This error may be caused by a GPU conflict (the default GPUs are occupied by other jobs).
Fix: add this near the top of train.py, before TensorFlow initializes CUDA:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "6,7"  # no spaces in the id list; some CUDA versions stop parsing at a space
Then training runs (step 1, 2, 3, ...).
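(Tip: before choosing device ids, check which GPUs are actually free:)
Terminal: nvidia-smi  # pick ids with near-zero memory use and utilization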
(Before this, the system needs a conda env with the requirements installed; that is not trivial. A sketch follows below.)
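(A minimal sketch of that setup, assuming the repo's requirements.txt; the env name tacotron2 and Python 3.6 are my arbitrary choices:)
Terminal: conda create -n tacotron2 python=3.6
Terminal: conda activate tacotron2
Terminal: pip install -r requirements.txt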
With this, one training step (batch size 32) takes about 4.5 s; although two GPUs are visible, memory seems to be allocated only on the first one.
(Need to test gpu_nums = 4 with 4 GPUs, and also different batch_size values and training step counts.)
### test starts here
File: train.py
os.environ["CUDA_VISIBLE_DEVICES"] = '5,6'
File: hparams.py
tacotron_num_gpus = 2
tacotron_batch_size = 32 * 2
File: train.py, change two argparse defaults (--embedding_interval and --eval_interval keep their defaults of 5000):
parser.add_argument('--summary_interval', type=int, default=250,
help='Steps between running summary ops')
parser.add_argument('--checkpoint_interval', type=int, default=2500,
help='Steps between writing checkpoints')
change =>
parser.add_argument('--summary_interval', type=int, default=1000,
help='Steps between running summary ops')
parser.add_argument('--checkpoint_interval', type=int, default=1000,
help='Steps between writing checkpoints')
Training succeeds, but batch_size = 64 makes a single step take about 5 s; don't know whether this beats batch_size = 32 on a single GPU when both are trained for 100k steps.
### test ends here
Now let's just wait.
After 25000 steps, training suddenly stops with the error: data feeder...
Just rerun python train.py --model='Tacotron-2' in the terminal; then it works again.
Don't know the reason; maybe test later. For now, if it happens again, halve the checkpoint interval (steps between model saves).
It happened again; decreased the interval to 250 steps. Still don't know why, maybe a gpu_nums problem. It happened once more, and the chosen GPUs turned out to be in use by other jobs, which may be the cause. The error is:
change GPUs from '5, 6' => '3, 4'
But still an error. Reverted to the defaults: gpu_num = 1, batch_size = 32, CUDA_VISIBLE_DEVICES = "6".
5. ToDo: the error that stops training happens every day; need to write a shell script that restarts training automatically and uses the free GPUs (a sketch follows below).
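A minimal watchdog sketch for this ToDo. Assumptions: "free" means under 1000 MiB in use (arbitrary threshold), and the hard-coded os.environ["CUDA_VISIBLE_DEVICES"] line in train.py is removed so the env var set here takes effect; untested against this setup.
#!/bin/bash
# restart_train.sh: relaunch training whenever it exits, on a GPU that looks free
while true; do
    # pick the first GPU with < 1000 MiB of memory in use
    free_gpu=$(nvidia-smi --query-gpu=index,memory.used --format=csv,noheader,nounits | awk -F', ' '$2 < 1000 {print $1; exit}')
    if [ -z "$free_gpu" ]; then
        echo "no free GPU, retrying in 60s"; sleep 60; continue
    fi
    echo "starting training on GPU $free_gpu"
    CUDA_VISIBLE_DEVICES="$free_gpu" python train.py --model='Tacotron-2'
    echo "train.py exited with status $?, restarting in 30s"
    sleep 30
done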
6. The waves in Tacotron-2-log/wav sound better than those in eval/wav; read the code to see why. Maybe the eval text is unseen in the training data. None of these are teacher-forced. Finished: the logged wavs come from the training set and eval/wavs from the test set; if both were teacher-forced, the split would only serve to check for overfitting.
7. Liang dada's paper with the figure about the teacher forcing rate. Finished: teacher_forcing_mode is in hparams.py, but its details are still unknown.
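For later digging, the teacher-forcing knobs seem to live in hparams.py under names like the following (quoted from memory of this repo version; verify against the local copy):
tacotron_teacher_forcing_mode = 'constant'  # 'constant' keeps a fixed ratio; 'scheduled' decays it over training
tacotron_teacher_forcing_ratio = 1.  # fraction of decoder steps fed the ground-truth frame instead of the previous prediction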
8. optimize
tensorflow.python.framework.errors_impl.InternalError: Failed to create session.
Maybe out of GPU memory? Try running with CUDA_VISIBLE_DEVICES='' to force CPU.
Why is speaker split into 0, 1, 2, 3? The dataset clearly does not have multiple speakers.
ValueError: Defined synthesis batch size 1 is smaller than minimum required 2 (num_gpus)! Please verify your synthesis batch size choice.
Fix: set num_gpu = 1, or run
CUDA_VISIBLE_DEVICES='' python synthesize.py --model='Tacotron-2' --mode='live'
or increase the number of GPUs.
Download the zip from https://github.com/awesome-archive/tacotron_cn.
Use the conda env tf1.10-pt1.10.
Then pip install pypinyin
Tried bash train.sh, but the right paths and training_data are not set up yet, so stopping here for now.
Wait for Xingchen's successful version.
1. "nvcc -V" to see cuda vesion
2. Install PyTorch 1.2:
conda install pytorch torchvision cudatoolkit=10.0 -c pytorch  # always times out!
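Two common workarounds for the timeout (the TUNA mirror channel URL is an assumption to double-check; torch 1.2.0 pairs with torchvision 0.4.0):
Terminal: conda config --add channels https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud/pytorch/
Terminal: conda install pytorch torchvision cudatoolkit=10.0  # omit -c pytorch so the mirror channel is used
Or via pip:
Terminal: pip install torch==1.2.0 torchvision==0.4.0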
3. git clone https://github.com/NVIDIA/tacotron2.git
4. mv tacotron2 Pt-Tacotron
5. git submodule init; git submodule update
6. sed -i -- 's,DUMMY,/home/data/LJSpeech-1.1/wavs,g' filelists/*.txt
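(A quick sanity check that the sed rewrite landed; standard grep, nothing repo-specific:)
Terminal: grep -m1 wavs filelists/*.txt  # each filelist should now reference /home/data/LJSpeech-1.1/wavs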