Discussion:
https://www.jiqizhixin.com/articles/2017-06-28-5
https://ricardokleinklein.github.io/2017/11/16/Attention-is-all-you-need.html
1. Differences between multi-GPU and single-GPU configuration
https://github.com/tensorflow/tensor2tensor/issues/124
https://github.com/tensorflow/tensor2tensor/issues/17
2. Running the same number of steps with multiple GPUs is slower than with a single GPU
https://github.com/tensorflow/tensor2tensor/issues/146
https://github.com/tensorflow/tensor2tensor/issues/390
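A likely explanation discussed in these issues: with synchronous data parallelism each step processes one batch per GPU, so a multi-GPU step takes longer in wall-clock time even though total throughput improves. A minimal sketch with purely illustrative numbers (not measurements):

```python
# Sketch: why per-step wall-clock time alone is misleading for multi-GPU runs.
# All numbers below are illustrative, not benchmarks.

def throughput(examples_per_step_per_gpu, num_gpus, seconds_per_step):
    """Examples processed per second across all GPUs (synchronous training)."""
    return examples_per_step_per_gpu * num_gpus / seconds_per_step

single = throughput(examples_per_step_per_gpu=32, num_gpus=1, seconds_per_step=0.5)
multi = throughput(examples_per_step_per_gpu=32, num_gpus=4, seconds_per_step=1.1)

# Each 4-GPU step is slower (1.1 s vs 0.5 s), yet overall throughput is higher.
print(single, multi)
```

So comparing runs at equal step counts undersells multi-GPU: each multi-GPU step also consumes several times more data.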
3. The batch_size parameter
https://github.com/tensorflow/tensor2tensor/issues/17#issuecomment-310268149
https://github.com/tensorflow/tensor2tensor/issues/415#issue-273498229
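As the linked comments explain, T2T's batch_size is a budget of subword tokens per batch (per GPU), not a count of sentences. A rough pure-Python sketch of the conversion (the helper name and the average sentence length are illustrative assumptions):

```python
# Sketch: T2T's batch_size counts subword tokens, not sentences
# (see the linked issue comments). Helper and numbers are illustrative.

def approx_sentences_per_batch(batch_size_tokens, avg_sentence_len_tokens):
    """Rough number of sentences that fit in one token-budget batch."""
    return batch_size_tokens // avg_sentence_len_tokens

# With a 4096-token budget and ~25-token sentences:
print(approx_sentences_per_batch(4096, 25))  # -> 163
```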
4. Data processing
4.1)https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/bin/t2t-datagen
generate_data_for_problem(problem)
4.2)https://github.com/tensorflow/tensor2tensor/blob/92983eaaa457ec18729b1883ba5ae4a6614bdcb5/tensor2tensor/data_generators/generator_utils.py
generate_files(generator, output_filenames, max_cases=None)
"""Generate cases from a generator and save as TFRecord files.

Generated cases are transformed to tf.Example protos and saved as TFRecords
in sharded files named output_dir/output_name-00..N-of-00..M=num_shards.

Args:
  generator: a generator yielding (string -> int/float/str list) dictionaries.
  output_filenames: List of output file paths.
  max_cases: maximum number of cases to get from the generator; if None
    (default), we use the generator until StopIteration is raised.
"""
Note: writers[shard].write(sequence_example.SerializeToString()) serializes each example into its shard file.
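The sharded filename pattern the docstring describes ("output_name-00..N-of-00..M") can be sketched without TensorFlow; the helper name below is hypothetical, not a t2t API:

```python
# Sketch of the sharded-output naming scheme from the generate_files docstring,
# e.g. train-00000-of-00003. Pure Python; sharded_filenames is a hypothetical
# helper, not part of tensor2tensor.

import os

def sharded_filenames(output_dir, output_name, num_shards):
    """One TFRecord path per shard, numbered 0..num_shards-1."""
    return [
        os.path.join(output_dir, "%s-%05d-of-%05d" % (output_name, i, num_shards))
        for i in range(num_shards)
    ]

print(sharded_filenames("/tmp/data", "train", 3))
```

The real writers then append each serialized tf.Example to its shard, round-robin, which is what the SerializeToString() call above does.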
4.3)
https://github.com/tensorflow/tensor2tensor/blob/92983eaaa457ec18729b1883ba5ae4a6614bdcb5/tensor2tensor/data_generators/generator_utils.py
get_or_generate_vocab(data_dir, tmp_dir, vocab_filename, vocab_size, sources)
get_or_generate_vocab_inner(data_dir, vocab_filename, vocab_size, generator)
"""Inner implementation for vocab generators.
Args:
data_dir: The base directory where data and vocab files are stored. If None, then do not save the vocab even if it doesn't exist.
vocab_filename: relative filename where vocab file is stored
vocab_size: target size of the vocabulary constructed by SubwordTextEncoder
generator: a generator that produces tokens from the vocabulary
Returns:
  A SubwordTextEncoder vocabulary object. """
vocab = text_encoder.SubwordTextEncoder.build_to_target_size( vocab_size, token_counts, 1, 1e3)
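Before that call, get_or_generate_vocab_inner consumes the token generator and tallies counts; that counting step can be sketched in pure Python (count_tokens and the toy corpus are illustrative, not t2t code):

```python
# Sketch of the token-counting step that produces the token_counts dict
# passed to build_to_target_size. Pure Python, no t2t dependency.

from collections import Counter

def count_tokens(token_generator):
    """Tally token frequencies from a generator, as the vocab builder needs."""
    return Counter(token_generator)

def toy_corpus():
    for line in ["the cat sat", "the cat ran", "a dog ran"]:
        for tok in line.split():
            yield tok

token_counts = count_tokens(toy_corpus())
print(token_counts["the"], token_counts["ran"])  # -> 2 2
```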
4.4)https://github.com/tensorflow/tensor2tensor/blob/e3cd447aa605515753ebfc3dbf1a4d4c5ae32425/tensor2tensor/data_generators/text_encoder.py
build_to_target_size(cls, target_size, token_counts, min_val, max_val, num_iterations=4)
"""Builds a SubwordTextEncoder that has `vocab_size` near `target_size`.
Uses simple recursive binary search to find a minimum token count that most closely matches the `target_size`.
Args:
  target_size: Desired vocab_size to approximate.
  token_counts: A dictionary of token counts, mapping string to int.
  min_val: An integer; lower bound for the minimum token count.
  max_val: An integer; upper bound for the minimum token count.
  num_iterations: An integer; how many iterations of refinement.
Returns:
  A SubwordTextEncoder instance.
Raises:
  ValueError: If `min_val` is greater than `max_val`. """
# We build iteratively. On each iteration, we segment all the words,
# then count the resulting potential subtokens, keeping the ones
# with high enough counts for our new vocabulary.
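The "recursive binary search" the docstring mentions rests on a monotonic relationship: raising the minimum token count shrinks the vocabulary. A simplified pure-Python sketch, where vocab_size_for stands in for the real subword construction:

```python
# Sketch of build_to_target_size's search idea: binary-search the minimum
# token count whose resulting vocabulary size approaches target_size.
# vocab_size_for is a stand-in for the real subword-vocabulary construction.

def vocab_size_for(token_counts, min_count):
    """Vocabulary size if we keep only tokens seen at least min_count times."""
    return sum(1 for c in token_counts.values() if c >= min_count)

def search_min_count(token_counts, target_size, min_val, max_val):
    """Binary-search min_count in [min_val, max_val] to approximate target_size."""
    if min_val > max_val:
        raise ValueError("min_val must not exceed max_val")
    mid = (min_val + max_val) // 2
    size = vocab_size_for(token_counts, mid)
    if size == target_size or min_val == max_val:
        return mid
    if size > target_size:  # vocab too large -> raise the count threshold
        return search_min_count(token_counts, target_size, mid + 1, max_val)
    return search_min_count(token_counts, target_size, min_val, max(min_val, mid - 1))

toy_counts = {"a": 5, "b": 4, "c": 3, "d": 2, "e": 1}
print(search_min_count(toy_counts, target_size=3, min_val=1, max_val=5))  # -> 3
```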
5. Training
https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/layers/common_hparams.py
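The hparams defined in common_hparams.py can be overridden from t2t-trainer's --hparams flag as a comma-separated "name=value" string. A sketch of that override format, parsed by a hypothetical helper (not t2t's own parser):

```python
# Sketch: the "name=value,name=value" string accepted by t2t-trainer's
# --hparams flag. parse_hparams_overrides is a hypothetical helper for
# illustration, not tensor2tensor code.

def parse_hparams_overrides(spec):
    """Parse 'k=v,k=v' into a dict, casting numeric values where possible."""
    overrides = {}
    for pair in spec.split(","):
        key, value = pair.split("=", 1)
        try:
            value = int(value)
        except ValueError:
            try:
                value = float(value)
            except ValueError:
                pass  # leave non-numeric values as strings
        overrides[key] = value
    return overrides

print(parse_hparams_overrides("batch_size=2048,learning_rate=0.05"))
```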
6. Server
6.1 https://research.googleblog.com/2017/11/latest-innovations-in-tensorflow-serving.html
6.2 https://towardsdatascience.com/how-to-deploy-machine-learning-models-with-tensorflow-part-1-make-your-model-ready-for-serving-776a14ec3198
6.3 http://blog.csdn.net/wangjian1204/article/details/68928656
6.4 https://weiminwang.blog/2017/09/12/introductory-guide-to-tensorflow-serving/
6.5 https://github.com/tensorflow/tensor2tensor/issues/368
6.6 https://github.com/tensorflow/tensor2tensor/issues/349
1) Crash when running the big model
tensorflow.python.framework.errors_impl.InvalidArgumentError: Number of ways to split should evenly divide the split dimension, but got split_dim 0 (size = 4) and num_split 3
Caused by op u'transformer/split', defined at: ...
Reference: https://github.com/tensorflow/tensor2tensor/issues/266
Simply reducing the batch size fixed the problem.
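The error itself is a divisibility problem: a dimension of size 4 was being split num_split=3 ways (e.g. a batch spread across 3 GPUs). A quick pure-Python check of the condition tf.split enforces:

```python
# Sketch: the InvalidArgumentError above means tf.split was asked to divide a
# dimension of size 4 into 3 equal parts. The check below mirrors the
# divisibility condition; can_split_evenly is an illustrative helper.

def can_split_evenly(dim_size, num_split):
    """True when a dimension of dim_size can be split into num_split equal parts."""
    return dim_size % num_split == 0

print(can_split_evenly(4, 3))  # -> False: the failing case from the traceback
print(can_split_evenly(6, 3))  # -> True
```

Shrinking the batch size works when it makes the size reaching the split divisible by the number of GPUs.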
2) File naming
t2t treats a file named newsdev2017-zhen-src.pre.bpe.zh as a tar archive, which fails with: tarfile.ReadError: file could not be opened successfully
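The standard library can confirm up front that such a plain-text corpus file is not actually a tar archive, which is the mismatch behind the ReadError:

```python
# Sketch: a plain BPE text file (like newsdev2017-zhen-src.pre.bpe.zh) is not
# a tar archive, so opening it with tarfile raises tarfile.ReadError.
# tarfile.is_tarfile (stdlib) checks this without raising.

import os
import tarfile
import tempfile

fd, path = tempfile.mkstemp(suffix=".pre.bpe.zh")
with os.fdopen(fd, "w") as f:
    f.write("a plain text line\n")

is_tar = tarfile.is_tarfile(path)
print(is_tar)  # -> False: t2t's tar-opening path would fail on this file
os.remove(path)
```

Renaming the file (or feeding it through a code path that does not try to unpack it) avoids the error.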