TensorFlow Seq2Seq Model Notes

0. TensorFlow kept running without using the GPU...

Embarrassing: training ran with the GPU idle and the CPU maxed out. The cause was installing the wrong package; it should be tensorflow-gpu.

Code to test whether TensorFlow is using the GPU: https://stackoverflow.com/questions/38009682/how-to-tell-if-tensorflow-is-using-gpu-acceleration-from-inside-python-shell
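
A minimal check following that thread (TF 1.x):

import tensorflow as tf
from tensorflow.python.client import device_lib

# Lists local devices; a working GPU install shows a device_type "GPU" entry.
print(device_lib.list_local_devices())

# Alternatively, log where each op is placed when the session runs.
with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
    print(sess.run(tf.constant(1.0)))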


1. Confusion about tf.app.run()

    http://stackoverflow.com/questions/33703624/how-does-tf-app-run-work

    tf.app works much like Python's argparse
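
A minimal sketch of the pattern (TF 1.x flag names):

import tensorflow as tf

FLAGS = tf.app.flags.FLAGS
tf.app.flags.DEFINE_integer("size", 1024, "Size of each model layer.")

def main(_):
    # tf.app.run() parses the flags first, then calls main(argv).
    print("size =", FLAGS.size)

if __name__ == "__main__":
    tf.app.run()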


2. variable scope 和 name scope

Variable Scope mechanism: https://www.tensorflow.org/programmers_guide/variable_scope

http://stackoverflow.com/questions/35919020/whats-the-difference-of-name-scope-and-a-variable-scope-in-tensorflow


Key point: Name scopes can be opened in addition to a variable scope, and then they will only affect the names of the ops, but not of variables.

with tf.variable_scope("foo"):
    with tf.name_scope("bar"):
        v = tf.get_variable("v", [1])
        x = 1.0 + v
assert v.name == "foo/v:0"
assert x.op.name == "foo/bar/add"


The difference between scope.original_name_scope and scope.name

http://stackoverflow.com/questions/41756054/tensorflow-variablescope-original-name-scope-vs-name
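
A small sketch of the distinction as I read that answer (TF 1.x; treat the exact strings as assumptions): re-opening a scope by string reuses the variable namespace but gets a fresh, uniquified name scope for ops.

import tensorflow as tf

with tf.variable_scope("foo") as scope1:
    pass
with tf.variable_scope("foo") as scope2:  # same string, opened a second time
    pass

print(scope1.name)                 # "foo"    -- variable namespace
print(scope2.name)                 # "foo"    -- shared with scope1
print(scope1.original_name_scope)  # "foo/"   -- name scope used for ops
print(scope2.original_name_scope)  # "foo_1/" -- uniquified on re-entry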


3. Differences between Python 2 and Python 3
While modifying the data_utils.py file (https://github.com/tensorflow/models/blob/master/tutorials/rnn/translate/data_utils.py):
I overlooked the version differences: in Python 3 the print statement is gone, replaced by the print() function.
Also, when running under Python 2:
with gfile.GFile(data_path, mode="rb") as f:
      counter = 0
      for line in f:


Here each line still contains its trailing '\n', so after split() the newline stays attached to the last word as it enters the list, and the file written from the list then disagrees with len(list).
For example, write li = ['a', 'b\n'] to a file: the '\n' adds an extra line break, so the file ends up with 3 lines.
On the next line-by-line read, the empty string ('') is then counted as a new word.
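
A minimal sketch of the fix (Python 2 shown; under Python 3, mode "rb" yields bytes, so each line would also need a .decode() first):

from tensorflow.python.platform import gfile

with gfile.GFile(data_path, mode="rb") as f:
    counter = 0
    for line in f:
        counter += 1
        words = line.strip().split()  # strip() drops the trailing '\n'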


4. The TensorFlow Saver class: https://www.tensorflow.org/api_docs/python/tf/train/Saver

http://blog.csdn.net/u011500062/article/details/51728830
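
A minimal save/restore sketch (TF 1.x):

import tensorflow as tf

v = tf.Variable(tf.zeros([10]), name="v")
saver = tf.train.Saver()  # by default saves every variable under its .name

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    path = saver.save(sess, "./ckpt/translate.ckpt", global_step=16)

with tf.Session() as sess:
    saver.restore(sess, path)  # restored variables need no initializer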


5. Saving the Seq2Seq model

The saved model files are:

translate.ckpt-16.data-00000-of-00001

translate.ckpt-16.index

translate.ckpt-16.meta

Roughly, the .data file stores the variable values, the .index file maps each variable name to its location in the data file, and the .meta file stores the serialized MetaGraphDef (the graph structure). For more discussion see:

https://groups.google.com/a/tensorflow.org/forum/#!topic/discuss/Y4mzbDAUSec 

http://stackoverflow.com/questions/36195454/what-is-the-tensorflow-checkpoint-meta-file


6. RNN examples

https://uqer.io/community/share/58a9332bf1973300597ae209

http://r2rt.com/recurrent-neural-networks-in-tensorflow-ii.html


7. List of tensors to a tensor

http://stackoverflow.com/questions/35730161/how-to-convert-a-list-of-tensors-of-dim-n-to-a-tensor-of-dim-n1

http://blog.csdn.net/sherry_up/article/details/52169318
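
The usual answer from that thread, as a sketch (tf.pack was renamed tf.stack in TF 1.0):

import tensorflow as tf

tensors = [tf.ones([4, 5]) for _ in range(3)]  # a list of 3 tensors, each [4, 5]
stacked = tf.stack(tensors, axis=0)            # a single tensor of shape [3, 4, 5]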


8. The batch_matmul problem

The operation I wanted: suppose I have a T x n x k tensor and want to multiply it by a k x k2 matrix, then take a max pool over T and a mean pool over n. To do this now, I think you need to reshape, do the matmul(), undo the reshape, and then do the pooling.

https://github.com/tensorflow/tensorflow/issues/216

https://www.tensorflow.org/versions/r0.10/api_docs/python/math_ops/matrix_math_functions#batch_matmul

Using it raised: AttributeError: 'module' object has no attribute 'batch_matmul'. Only then did I find that version 1.0 no longer has it; you use matmul with the right arguments instead.
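
A sketch of the reshape route for the case above (TF 1.0; tf.matmul also multiplies two 3-D tensors batch-wise directly, but here the second operand is 2-D):

import tensorflow as tf

T, n, k, k2 = 7, 4, 5, 6
x = tf.random_normal([T, n, k])
w = tf.random_normal([k, k2])

y = tf.reshape(tf.matmul(tf.reshape(x, [-1, k]), w), [T, n, k2])  # T x n x k2
pooled = tf.reduce_mean(tf.reduce_max(y, axis=0), axis=0)         # max over T, then mean over n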


9. Cannot feed value of shape (XX) for Tensor u'target/Y:0', which has shape '(YY)'?

First time running into this. Searching suggested the shape of the input feed does not match, but the failure was on the fifth variable (mask5), which made me think it was not an input-feed problem (if it were, why would it only fail on the fifth one?). After a long detour it turned out to be a problem with the input data after all...


10. Loading an existing model

I had always been able to load an existing model from the given directory, until one day it stopped working. It took several hours to notice that the directory used to contain a checkpoint file, which tells TensorFlow two paths:

model_checkpoint_path: "translate.ckpt-101000"
all_model_checkpoint_paths: "translate.ckpt-101000"
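
A sketch of how tutorial-style loading depends on that file (train_dir, saver, and sess are assumed to be defined elsewhere): without it, get_checkpoint_state() returns None and the parameters are re-initialized.

import tensorflow as tf

ckpt = tf.train.get_checkpoint_state(train_dir)
if ckpt and ckpt.model_checkpoint_path:
    saver.restore(sess, ckpt.model_checkpoint_path)
else:
    sess.run(tf.global_variables_initializer())  # falls back to fresh parameters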


11. Reading parameter values from a model

https://www.tensorflow.org/programmers_guide/variables#checkpoint_files:

When you create a Saver object, you can optionally choose names for the variables in the checkpoint files. By default, it uses the value of the tf.Variable.name property for each variable. 

To understand what variables are in a checkpoint, you can use the inspect_checkpoint library, and in particular, the print_tensors_in_checkpoint_file function.

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/tools/inspect_checkpoint.py

Usage:

python inspect_checkpoint.py  --file_name=./alpha_easy_nmt/valid_model/translate.ckpt-625000

Output:

Decoder/trg_lookup_table/embedding (DT_FLOAT) [16000,620]
Decoder/trg_lookup_table/embedding/Adadelta (DT_FLOAT) [16000,620]
Decoder/trg_lookup_table/embedding/Adadelta_1 (DT_FLOAT) [16000,620]


Usage:

python inspect_checkpoint.py  --file_name=./alpha_easy_nmt/valid_model/translate.ckpt-625000 --tensor_name=Decoder/W_sf

Output:

tensor_name:  Decoder/W_sf
[[ -4.55709170e-07  -9.10816539e-07   4.44753543e-02 ...,  -2.58049741e-02
    4.26506670e-03  -3.64431571e-07]
 [  7.86067460e-07   7.86348721e-07   1.29140466e-02 ...,   7.92008177e-06
    5.49392325e-07   6.99410566e-06]
 [ -5.86683996e-07   5.51591484e-08   9.70983803e-02 ...,   2.75615434e-07
   -4.86231060e-04   1.23817983e-07]
 ...,
 [ -1.40239194e-06  -1.00237912e-06  -1.44313052e-01 ...,  -1.33047411e-06
   -1.17946070e-06  -2.41477892e-07]
 [  1.19242941e-06  -9.48488719e-08  -2.48298571e-02 ...,   1.00101170e-03
   -3.03782895e-03   1.45507602e-06]
 [ -1.27071712e-06  -1.27975386e-06  -2.31240150e-02 ...,  -7.33333752e-02
    2.30671745e-03  -5.72958811e-07]]


12. The default initializer of tf.get_variable

https://www.tensorflow.org/api_docs/python/tf/get_variable:

If initializer is None (the default), the default initializer passed in the variable scope will be used. If that one is None too, a glorot_uniform_initializer will be used. The initializer can also be a Tensor, in which case the variable is initialized to this value and shape.

Oddly, glorot_uniform_initializer cannot be found anywhere else in the documentation; someone did ask about this on GitHub, but nobody answered: https://github.com/tensorflow/tensorflow/issues/7791.
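
A small sketch of what the default amounts to (assuming a TF version that exports tf.glorot_uniform_initializer; in earlier 1.x releases the same scheme is tf.contrib.layers.xavier_initializer):

import tensorflow as tf

# Both should draw from Uniform(-limit, limit), limit = sqrt(6 / (fan_in + fan_out)).
w1 = tf.get_variable("w1", [128, 256])  # initializer=None -> glorot uniform
w2 = tf.get_variable("w2", [128, 256],
                     initializer=tf.glorot_uniform_initializer())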


13. System log

13.1 The system previously plateaued at BLEU 20 on NIST06 (both the Theano version and my junior labmate's version reach 34). Printing every beam during beam search showed a lot of over-translation. Digging in, I found a small bug in my system: in beam search I had forgotten to apply the mask to the source annotations.

Vocab = 16k, Batch = 50; for both optimizers, BLEU is measured every 3000 batches.

Adam:

72000   BLEU score = 0.2412     BEST BLEU is 0
75000   BLEU score = 0.2377     BEST BLEU is 0.2412
78000   BLEU score = 0.2380     BEST BLEU is 0.2412
81000   BLEU score = 0.2513     BEST BLEU is 0.2412
84000   BLEU score = 0.2231     BEST BLEU is 0.2513
87000   BLEU score = 0.2527     BEST BLEU is 0.2513
90000   BLEU score = 0.2314     BEST BLEU is 0.2527
93000   BLEU score = 0.2498     BEST BLEU is 0.2527
96000   BLEU score = 0.2445     BEST BLEU is 0.2527
99000   BLEU score = 0.2487     BEST BLEU is 0.2527
102000  BLEU score = 0.2497     BEST BLEU is 0.2527
105000  BLEU score = 0.2523     BEST BLEU is 0.2527
108000  BLEU score = 0.2375     BEST BLEU is 0.2527
111000  BLEU score = 0.2380     BEST BLEU is 0.2527
114000  BLEU score = 0.2457     BEST BLEU is 0.2527
117000  BLEU score = 0.2525     BEST BLEU is 0.2527
120000  BLEU score = 0.2519     BEST BLEU is 0.2527
123000  BLEU score = 0.2491     BEST BLEU is 0.2527
126000  BLEU score = 0.2391     BEST BLEU is 0.2527
129000  BLEU score = 0.2304     BEST BLEU is 0.2527
132000  BLEU score = 0.2618     BEST BLEU is 0.2527
135000  BLEU score = 0.2489     BEST BLEU is 0.2618
138000  BLEU score = 0.2458     BEST BLEU is 0.2618
141000  BLEU score = 0.2505     BEST BLEU is 0.2618
144000  BLEU score = 0.2558     BEST BLEU is 0.2618
147000  BLEU score = 0.2492     BEST BLEU is 0.2618
150000  BLEU score = 0.2463     BEST BLEU is 0.2618
153000  BLEU score = 0.2586     BEST BLEU is 0.2618
156000  BLEU score = 0.2495     BEST BLEU is 0.2618
159000  BLEU score = 0.2568     BEST BLEU is 0.2618
162000  BLEU score = 0.2571     BEST BLEU is 0.2618
165000  BLEU score = 0.2611     BEST BLEU is 0.2618
168000  BLEU score = 0.2508     BEST BLEU is 0.2618
171000  BLEU score = 0.2450     BEST BLEU is 0.2618
174000  BLEU score = 0.2459     BEST BLEU is 0.2618
177000  BLEU score = 0.2579     BEST BLEU is 0.2618
180000  BLEU score = 0.2580     BEST BLEU is 0.2618
183000  BLEU score = 0.2520     BEST BLEU is 0.2618
186000  BLEU score = 0.2730     BEST BLEU is 0.2618
189000  BLEU score = 0.2430     BEST BLEU is 0.273
192000  BLEU score = 0.2571     BEST BLEU is 0.273
195000  BLEU score = 0.2541     BEST BLEU is 0.273
198000  BLEU score = 0.2471     BEST BLEU is 0.273
201000  BLEU score = 0.2491     BEST BLEU is 0.273
204000  BLEU score = 0.2589     BEST BLEU is 0.273
207000  BLEU score = 0.2523     BEST BLEU is 0.273
210000  BLEU score = 0.2536     BEST BLEU is 0.273
213000  BLEU score = 0.2557     BEST BLEU is 0.273
216000  BLEU score = 0.2457     BEST BLEU is 0.273
219000  BLEU score = 0.2661     BEST BLEU is 0.273
222000  BLEU score = 0.2515     BEST BLEU is 0.273
225000  BLEU score = 0.2644     BEST BLEU is 0.273
228000  BLEU score = 0.2616     BEST BLEU is 0.273
231000  BLEU score = 0.2554     BEST BLEU is 0.273
234000  BLEU score = 0.2621     BEST BLEU is 0.273
237000  BLEU score = 0.2519     BEST BLEU is 0.273
240000  BLEU score = 0.2440     BEST BLEU is 0.273
243000  BLEU score = 0.2572     BEST BLEU is 0.273
246000  BLEU score = 0.2488     BEST BLEU is 0.273
249000  BLEU score = 0.2631     BEST BLEU is 0.273
252000  BLEU score = 0.2584     BEST BLEU is 0.273
255000  BLEU score = 0.2570     BEST BLEU is 0.273
258000  BLEU score = 0.2581     BEST BLEU is 0.273
261000  BLEU score = 0.2510     BEST BLEU is 0.273
264000  BLEU score = 0.2476     BEST BLEU is 0.273
267000  BLEU score = 0.2667     BEST BLEU is 0.273
270000  BLEU score = 0.2689     BEST BLEU is 0.273
273000  BLEU score = 0.2596     BEST BLEU is 0.273
276000  BLEU score = 0.2592     BEST BLEU is 0.273
279000  BLEU score = 0.2617     BEST BLEU is 0.273
282000  BLEU score = 0.2652     BEST BLEU is 0.273
285000  BLEU score = 0.2651     BEST BLEU is 0.273
288000  BLEU score = 0.2732     BEST BLEU is 0.273
291000  BLEU score = 0.2505     BEST BLEU is 0.2732
294000  BLEU score = 0.2545     BEST BLEU is 0.2732
297000  BLEU score = 0.2737     BEST BLEU is 0.2732
300000  BLEU score = 0.2662     BEST BLEU is 0.2737

Adadelta:

72000   BLEU score = 0.1732     BEST BLEU is 0
75000   BLEU score = 0.1752     BEST BLEU is 0.1732
78000   BLEU score = 0.1888     BEST BLEU is 0.1752
81000   BLEU score = 0.1771     BEST BLEU is 0.1888
84000   BLEU score = 0.1876     BEST BLEU is 0.1888
87000   BLEU score = 0.1968     BEST BLEU is 0.1888
90000   BLEU score = 0.1664     BEST BLEU is 0.1968
93000   BLEU score = 0.2059     BEST BLEU is 0.1968
96000   BLEU score = 0.1816     BEST BLEU is 0.2059
99000   BLEU score = 0.2098     BEST BLEU is 0.2059
102000  BLEU score = 0.2086     BEST BLEU is 0.2098
105000  BLEU score = 0.2029     BEST BLEU is 0.2098
108000  BLEU score = 0.2222     BEST BLEU is 0.2098
111000  BLEU score = 0.1929     BEST BLEU is 0.2222
114000  BLEU score = 0.1951     BEST BLEU is 0.2222
117000  BLEU score = 0.2212     BEST BLEU is 0.2222
120000  BLEU score = 0.2111     BEST BLEU is 0.2222
123000  BLEU score = 0.1981     BEST BLEU is 0.2222
126000  BLEU score = 0.2054     BEST BLEU is 0.2222
129000  BLEU score = 0.2228     BEST BLEU is 0.2222
132000  BLEU score = 0.2250     BEST BLEU is 0.2228
135000  BLEU score = 0.2061     BEST BLEU is 0.225
138000  BLEU score = 0.2333     BEST BLEU is 0.225
141000  BLEU score = 0.2236     BEST BLEU is 0.2333
144000  BLEU score = 0.2123     BEST BLEU is 0.2333
147000  BLEU score = 0.2242     BEST BLEU is 0.2333
150000  BLEU score = 0.2120     BEST BLEU is 0.2333
153000  BLEU score = 0.2404     BEST BLEU is 0.2333
156000  BLEU score = 0.2348     BEST BLEU is 0.2404
159000  BLEU score = 0.2195     BEST BLEU is 0.2404
162000  BLEU score = 0.2383     BEST BLEU is 0.2404
165000  BLEU score = 0.2192     BEST BLEU is 0.2404
168000  BLEU score = 0.2240     BEST BLEU is 0.2404
171000  BLEU score = 0.2265     BEST BLEU is 0.2404
174000  BLEU score = 0.2211     BEST BLEU is 0.2404
177000  BLEU score = 0.2302     BEST BLEU is 0.2404
180000  BLEU score = 0.2360     BEST BLEU is 0.2404
183000  BLEU score = 0.2161     BEST BLEU is 0.2404
186000  BLEU score = 0.2316     BEST BLEU is 0.2404
189000  BLEU score = 0.2298     BEST BLEU is 0.2404
192000  BLEU score = 0.2316     BEST BLEU is 0.2404
195000  BLEU score = 0.2166     BEST BLEU is 0.2404
198000  BLEU score = 0.2350     BEST BLEU is 0.2404
201000  BLEU score = 0.2295     BEST BLEU is 0.2404
204000  BLEU score = 0.2456     BEST BLEU is 0.2404
207000  BLEU score = 0.2392     BEST BLEU is 0.2456
210000  BLEU score = 0.2311     BEST BLEU is 0.2456
213000  BLEU score = 0.2113     BEST BLEU is 0.2456
216000  BLEU score = 0.2223     BEST BLEU is 0.2456
219000  BLEU score = 0.2258     BEST BLEU is 0.2456
222000  BLEU score = 0.2304     BEST BLEU is 0.2456
225000  BLEU score = 0.2165     BEST BLEU is 0.2456
228000  BLEU score = 0.2336     BEST BLEU is 0.2456
231000  BLEU score = 0.2345     BEST BLEU is 0.2456
234000  BLEU score = 0.2444     BEST BLEU is 0.2456
237000  BLEU score = 0.2310     BEST BLEU is 0.2456
240000  BLEU score = 0.2406     BEST BLEU is 0.2456
243000  BLEU score = 0.2294     BEST BLEU is 0.2456
246000  BLEU score = 0.2469     BEST BLEU is 0.2456
249000  BLEU score = 0.2479     BEST BLEU is 0.2469
252000  BLEU score = 0.2464     BEST BLEU is 0.2479
255000  BLEU score = 0.2490     BEST BLEU is 0.2479
258000  BLEU score = 0.2406     BEST BLEU is 0.249
261000  BLEU score = 0.2477     BEST BLEU is 0.249
264000  BLEU score = 0.2392     BEST BLEU is 0.249
267000  BLEU score = 0.2516     BEST BLEU is 0.249
270000  BLEU score = 0.2521     BEST BLEU is 0.2516
273000  BLEU score = 0.2370     BEST BLEU is 0.2521
276000  BLEU score = 0.2431     BEST BLEU is 0.2521
279000  BLEU score = 0.2548     BEST BLEU is 0.2521
282000  BLEU score = 0.2605     BEST BLEU is 0.2548
285000  BLEU score = 0.2421     BEST BLEU is 0.2605
288000  BLEU score = 0.2446     BEST BLEU is 0.2605
291000  BLEU score = 0.2521     BEST BLEU is 0.2605
294000  BLEU score = 0.2529     BEST BLEU is 0.2605
297000  BLEU score = 0.2453     BEST BLEU is 0.2605
300000  BLEU score = 0.2361     BEST BLEU is 0.2605


13.2 The system still does not match my labmate's performance. Further checking found a problem with the init_context variable: for convenience I used the last source annotation as the init context, whereas my labmate and the earlier Theano version use mean pooling. "Nematus: a Toolkit for Neural Machine Translation" also mentions this point: We initialize the decoder hidden state with the mean of the source annotation, rather than the annotation at the last position of the encoder backward RNN. So my shortcut appears to be the problem.
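
A sketch of the mean-pooling fix (annotations and mask are assumed names here: annotations is [T, batch, d], mask is [T, batch] with 1.0 at real tokens):

import tensorflow as tf

masked = annotations * tf.expand_dims(mask, -1)           # zero out padding positions
lengths = tf.expand_dims(tf.reduce_sum(mask, axis=0), 1)  # [batch, 1]
init_context = tf.reduce_sum(masked, axis=0) / lengths    # masked mean, [batch, d]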


13.3 


14. Dropout issue

http://blog.csdn.net/wangxinginnlp/article/details/72649820



*.  Tracing GRU+Embedding

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/python/ops/rnn_cell_impl.py
class _RNNCell(object):
  """Abstract object representing an RNN cell.
  Every `RNNCell` must have the properties below and implement `__call__` with
  the following signature.
  This definition of cell differs from the definition used in the literature.
  In the literature, 'cell' refers to an object with a single scalar output.
  This definition refers to a horizontal array of such units.
  An RNN cell, in the most abstract setting, is anything that has
  a state and performs some operation that takes a matrix of inputs.
  This operation results in an output matrix with `self.output_size` columns.
  If `self.state_size` is an integer, this operation also results in a new
  state matrix with `self.state_size` columns.  If `self.state_size` is a
  tuple of integers, then it results in a tuple of `len(state_size)` state
  matrices, each with a column size corresponding to values in `state_size`.
  """

  def __call__(self, inputs, state, scope=None):
    """Run this RNN cell on inputs, starting from the given state.
    Args:
      inputs: `2-D` tensor with shape `[batch_size x input_size]`.
      state: if `self.state_size` is an integer, this should be a `2-D Tensor`
        with shape `[batch_size x self.state_size]`.  Otherwise, if
        `self.state_size` is a tuple of integers, this should be a tuple
        with shapes `[batch_size x s] for s in self.state_size`.
      scope: VariableScope for the created subgraph; defaults to class name.
    Returns:
      A pair containing:
      - Output: A `2-D` tensor with shape `[batch_size x self.output_size]`.
      - New state: Either a single `2-D` tensor, or a tuple of tensors matching
        the arity and shapes of `state`.
    """
    raise NotImplementedError("Abstract method")

  @property
  def state_size(self):
    """size(s) of state(s) used by this cell.
    It can be represented by an Integer, a TensorShape or a tuple of Integers
    or TensorShapes.
    """
    raise NotImplementedError("Abstract method")

  @property
  def output_size(self):
    """Integer or TensorShape: size of outputs produced by this cell."""
    raise NotImplementedError("Abstract method")

  def zero_state(self, batch_size, dtype):
    """Return zero-filled state tensor(s).
    Args:
      batch_size: int, float, or unit Tensor representing the batch size.
      dtype: the data type to use for the state.
    Returns:
      If `state_size` is an int or TensorShape, then the return value is a
      `N-D` tensor of shape `[batch_size x state_size]` filled with zeros.
      If `state_size` is a nested list or tuple, then the return value is
      a nested list or tuple (of the same structure) of `2-D` tensors with
      the shapes `[batch_size x s]` for each s in `state_size`.
    """
    with ops.name_scope(type(self).__name__ + "ZeroState", values=[batch_size]):
      state_size = self.state_size
      return _zero_state_tensors(state_size, batch_size, dtype)



https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py

_BIAS_VARIABLE_NAME = "biases"
_WEIGHTS_VARIABLE_NAME = "weights"

'''
matmul(): multiplies matrix `a` by matrix `b`, producing `a` * `b`.
concat(): Concatenates tensors along one dimension.
'''

def _linear(args, output_size, bias, bias_start=0.0):
  """Linear map: sum_i(args[i] * W[i]), where W[i] is a variable.
  Args:
    args: a 2D Tensor or a list of 2D, batch x n, Tensors.
    output_size: int, second dimension of W[i].
    bias: boolean, whether to add a bias term or not.
    bias_start: starting value to initialize the bias; 0 by default.
  Returns:
    A 2D Tensor with shape [batch x output_size] equal to
    sum_i(args[i] * W[i]), where W[i]s are newly created matrices.
  Raises:
    ValueError: if some of the arguments has unspecified or wrong shape.
  """
  if args is None or (nest.is_sequence(args) and not args):
    raise ValueError("`args` must be specified")
  if not nest.is_sequence(args):
    args = [args]

  # Calculate the total size of arguments on dimension 1.
  total_arg_size = 0
  shapes = [a.get_shape() for a in args]
  for shape in shapes:
    if shape.ndims != 2:
      raise ValueError("linear is expecting 2D arguments: %s" % shapes)
    if shape[1].value is None:
      raise ValueError("linear expects shape[1] to be provided for shape %s, "
                       "but saw %s" % (shape, shape[1]))
    else:
      total_arg_size += shape[1].value

  dtype = [a.dtype for a in args][0]

  # Now the computation.
  scope = vs.get_variable_scope()
  with vs.variable_scope(scope) as outer_scope:
    weights = vs.get_variable(
        _WEIGHTS_VARIABLE_NAME, [total_arg_size, output_size], dtype=dtype)
    if len(args) == 1:
      res = math_ops.matmul(args[0], weights)
    else:
      res = math_ops.matmul(array_ops.concat(args, 1), weights)
    if not bias:
      return res
    with vs.variable_scope(outer_scope) as inner_scope:
      inner_scope.set_partitioner(None)
      biases = vs.get_variable(
          _BIAS_VARIABLE_NAME, [output_size],
          dtype=dtype,
          initializer=init_ops.constant_initializer(bias_start, dtype=dtype))
    return nn_ops.bias_add(res, biases)




class GRUCell(RNNCell):
  """Gated Recurrent Unit cell (cf. http://arxiv.org/abs/1406.1078)."""

  def __init__(self, num_units, input_size=None, activation=tanh, reuse=None):
    if input_size is not None:
      logging.warn("%s: The input_size parameter is deprecated.", self)
    self._num_units = num_units
    self._activation = activation
    self._reuse = reuse

  @property
  def state_size(self):
    return self._num_units

  @property
  def output_size(self):
    return self._num_units

  def __call__(self, inputs, state, scope=None):
    """Gated recurrent unit (GRU) with nunits cells."""
    with _checked_scope(self, scope or "gru_cell", reuse=self._reuse):
      with vs.variable_scope("gates"):  # Reset gate and update gate.
        # We start with bias of 1.0 to not reset and not update.
        value = sigmoid(_linear(
          [inputs, state], 2 * self._num_units, True, 1.0))
        r, u = array_ops.split(
            value=value,
            num_or_size_splits=2,
            axis=1)
      with vs.variable_scope("candidate"):
        c = self._activation(_linear([inputs, r * state],
                                     self._num_units, True))
      new_h = u * state + (1 - u) * c
    return new_h, new_h
	
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py
class EmbeddingWrapper(RNNCell):
  """Operator adding input embedding to the given cell.
  Note: in many cases it may be more efficient to not use this wrapper,
  but instead concatenate the whole sequence of your inputs in time,
  do the embedding on this batch-concatenated sequence, then split it and
  feed into your RNN.
  """

  def __init__(self, cell, embedding_classes, embedding_size, initializer=None,
               reuse=None):
    """Create a cell with an added input embedding.
    Args:
      cell: an RNNCell, an embedding will be put before its inputs.
      embedding_classes: integer, how many symbols will be embedded.
      embedding_size: integer, the size of the vectors we embed into.
      initializer: an initializer to use when creating the embedding;
        if None, the initializer from variable scope or a default one is used.
      reuse: (optional) Python boolean describing whether to reuse variables
        in an existing scope.  If not `True`, and the existing scope already has
        the given variables, an error is raised.
    Raises:
      TypeError: if cell is not an RNNCell.
      ValueError: if embedding_classes is not positive.
    """
    if not isinstance(cell, RNNCell):
      raise TypeError("The parameter cell is not RNNCell.")
    if embedding_classes <= 0 or embedding_size <= 0:
      raise ValueError("Both embedding_classes and embedding_size must be > 0: "
                       "%d, %d." % (embedding_classes, embedding_size))
    self._cell = cell
    self._embedding_classes = embedding_classes
    self._embedding_size = embedding_size
    self._initializer = initializer
    self._reuse = reuse

  @property
  def state_size(self):
    return self._cell.state_size

  @property
  def output_size(self):
    return self._cell.output_size

  def zero_state(self, batch_size, dtype):
    with ops.name_scope(type(self).__name__ + "ZeroState", values=[batch_size]):
      return self._cell.zero_state(batch_size, dtype)

  def __call__(self, inputs, state, scope=None):
    """Run the cell on embedded inputs."""
    with _checked_scope(self, scope or "embedding_wrapper", reuse=self._reuse):
      with ops.device("/cpu:0"):
        if self._initializer:
          initializer = self._initializer
        elif vs.get_variable_scope().initializer:
          initializer = vs.get_variable_scope().initializer
        else:
          # Default initializer for embeddings should have variance=1.
          sqrt3 = math.sqrt(3)  # Uniform(-sqrt(3), sqrt(3)) has variance=1.
          initializer = init_ops.random_uniform_initializer(-sqrt3, sqrt3)

        if type(state) is tuple:
          data_type = state[0].dtype
        else:
          data_type = state.dtype

        embedding = vs.get_variable(
            "embedding", [self._embedding_classes, self._embedding_size],
            initializer=initializer,
            dtype=data_type)
        embedded = embedding_ops.embedding_lookup(
            embedding, array_ops.reshape(inputs, [-1]))
    return self._cell(embedded, state)

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/rnn/python/ops/core_rnn.py

def static_rnn(cell, inputs, initial_state=None, dtype=None,
               sequence_length=None, scope=None):
  """Creates a recurrent neural network specified by RNNCell `cell`.
  The simplest form of RNN network generated is:
  ```python
    state = cell.zero_state(...)
    outputs = []
    for input_ in inputs:
      output, state = cell(input_, state)
      outputs.append(output)
    return (outputs, state)
  ```
  However, a few other options are available:
  An initial state can be provided.
  If the sequence_length vector is provided, dynamic calculation is performed.
  This method of calculation does not compute the RNN steps past the maximum
  sequence length of the minibatch (thus saving computational time),
  and properly propagates the state at an example's sequence length
  to the final state output.
  The dynamic calculation performed is, at time `t` for batch row `b`,
  ```python
    (output, state)(b, t) =
      (t >= sequence_length(b))
        ? (zeros(cell.output_size), states(b, sequence_length(b) - 1))
        : cell(input(b, t), state(b, t - 1))
  ```
  Args:
    cell: An instance of RNNCell.
    inputs: A length T list of inputs, each a `Tensor` of shape
      `[batch_size, input_size]`, or a nested tuple of such elements.
    initial_state: (optional) An initial state for the RNN.
      If `cell.state_size` is an integer, this must be
      a `Tensor` of appropriate type and shape `[batch_size, cell.state_size]`.
      If `cell.state_size` is a tuple, this should be a tuple of
      tensors having shapes `[batch_size, s] for s in cell.state_size`.
    dtype: (optional) The data type for the initial state and expected output.
      Required if initial_state is not provided or RNN state has a heterogeneous
      dtype.
    sequence_length: Specifies the length of each sequence in inputs.
      An int32 or int64 vector (tensor) size `[batch_size]`, values in `[0, T)`.
    scope: VariableScope for the created subgraph; defaults to "rnn".
  Returns:
    A pair (outputs, state) where:
    - outputs is a length T list of outputs (one for each input), or a nested
      tuple of such elements.
    - state is the final state
  Raises:
    TypeError: If `cell` is not an instance of RNNCell.
    ValueError: If `inputs` is `None` or an empty list, or if the input depth
      (column size) cannot be inferred from inputs via shape inference.
  """

  if not isinstance(cell, core_rnn_cell.RNNCell):
    raise TypeError("cell must be an instance of RNNCell")
  if not nest.is_sequence(inputs):
    raise TypeError("inputs must be a sequence")
  if not inputs:
    raise ValueError("inputs must not be empty")

  outputs = []
  # Create a new scope in which the caching device is either
  # determined by the parent scope, or is set to place the cached
  # Variable using the same placement as for the rest of the RNN.
  with vs.variable_scope(scope or "rnn") as varscope:
    if varscope.caching_device is None:
      varscope.set_caching_device(lambda op: op.device)

    # Obtain the first sequence of the input
    first_input = inputs
    while nest.is_sequence(first_input):
      first_input = first_input[0]

    # Temporarily avoid EmbeddingWrapper and seq2seq badness
    # TODO(lukaszkaiser): remove EmbeddingWrapper
    if first_input.get_shape().ndims != 1:

      input_shape = first_input.get_shape().with_rank_at_least(2)
      fixed_batch_size = input_shape[0]

      flat_inputs = nest.flatten(inputs)
      for flat_input in flat_inputs:
        input_shape = flat_input.get_shape().with_rank_at_least(2)
        batch_size, input_size = input_shape[0], input_shape[1:]
        fixed_batch_size.merge_with(batch_size)
        for i, size in enumerate(input_size):
          if size.value is None:
            raise ValueError(
                "Input size (dimension %d of inputs) must be accessible via "
                "shape inference, but saw value None." % i)
    else:
      fixed_batch_size = first_input.get_shape().with_rank_at_least(1)[0]

    if fixed_batch_size.value:
      batch_size = fixed_batch_size.value
    else:
      batch_size = array_ops.shape(first_input)[0]
    if initial_state is not None:
      state = initial_state
    else:
      if not dtype:
        raise ValueError("If no initial_state is provided, "
                         "dtype must be specified")
      state = cell.zero_state(batch_size, dtype)

    if sequence_length is not None:  # Prepare variables
      sequence_length = ops.convert_to_tensor(
          sequence_length, name="sequence_length")
      if sequence_length.get_shape().ndims not in (None, 1):
        raise ValueError(
            "sequence_length must be a vector of length batch_size")
      def _create_zero_output(output_size):
        # convert int to TensorShape if necessary
        size = _state_size_with_prefix(output_size, prefix=[batch_size])
        output = array_ops.zeros(
            array_ops.stack(size), _infer_state_dtype(dtype, state))
        shape = _state_size_with_prefix(
            output_size, prefix=[fixed_batch_size.value])
        output.set_shape(tensor_shape.TensorShape(shape))
        return output

      output_size = cell.output_size
      flat_output_size = nest.flatten(output_size)
      flat_zero_output = tuple(
          _create_zero_output(size) for size in flat_output_size)
      zero_output = nest.pack_sequence_as(structure=output_size,
                                          flat_sequence=flat_zero_output)

      sequence_length = math_ops.to_int32(sequence_length)
      min_sequence_length = math_ops.reduce_min(sequence_length)
      max_sequence_length = math_ops.reduce_max(sequence_length)

    for time, input_ in enumerate(inputs):
      if time > 0: varscope.reuse_variables()
      # pylint: disable=cell-var-from-loop
      call_cell = lambda: cell(input_, state)
      # pylint: enable=cell-var-from-loop
      if sequence_length is not None:
        (output, state) = _rnn_step(
            time=time,
            sequence_length=sequence_length,
            min_sequence_length=min_sequence_length,
            max_sequence_length=max_sequence_length,
            zero_output=zero_output,
            state=state,
            call_cell=call_cell,
            state_size=cell.state_size)
      else:
        (output, state) = call_cell()

      outputs.append(output)

    return (outputs, state)
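
A usage sketch tying the three pieces above together (TF 1.0/1.1 contrib paths; the sizes are illustrative assumptions):

import tensorflow as tf

cell = tf.contrib.rnn.GRUCell(128)
cell = tf.contrib.rnn.EmbeddingWrapper(cell, embedding_classes=16000,
                                       embedding_size=620)
# T=10 time steps of int32 word ids, one 1-D batch tensor per step.
inputs = [tf.placeholder(tf.int32, [None]) for _ in range(10)]
outputs, state = tf.contrib.rnn.static_rnn(cell, inputs, dtype=tf.float32)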

	


*. Tracing embedding_attention_seq2seq

embedding_attention_decoder -> attention_decoder

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py
def attention_decoder(decoder_inputs,
                      initial_state,
                      attention_states,
                      cell,
                      output_size=None,
                      num_heads=1,
                      loop_function=None,
                      dtype=None,
                      scope=None,
                      initial_state_attention=False):
  """RNN decoder with attention for the sequence-to-sequence model.
  In this context "attention" means that, during decoding, the RNN can look up
  information in the additional tensor attention_states, and it does this by
  focusing on a few entries from the tensor. This model has proven to yield
  especially good results in a number of sequence-to-sequence tasks. This
  implementation is based on http://arxiv.org/abs/1412.7449 (see below for
  details). It is recommended for complex sequence-to-sequence tasks.
  Args:
    decoder_inputs: A list of 2D Tensors [batch_size x input_size].
    initial_state: 2D Tensor [batch_size x cell.state_size].
    attention_states: 3D Tensor [batch_size x attn_length x attn_size].
    cell: core_rnn_cell.RNNCell defining the cell function and size.
    output_size: Size of the output vectors; if None, we use cell.output_size.
    num_heads: Number of attention heads that read from attention_states.
    loop_function: If not None, this function will be applied to i-th output
      in order to generate i+1-th input, and decoder_inputs will be ignored,
      except for the first element ("GO" symbol). This can be used for decoding,
      but also for training to emulate http://arxiv.org/abs/1506.03099.
      Signature -- loop_function(prev, i) = next
        * prev is a 2D Tensor of shape [batch_size x output_size],
        * i is an integer, the step number (when advanced control is needed),
        * next is a 2D Tensor of shape [batch_size x input_size].
    dtype: The dtype to use for the RNN initial state (default: tf.float32).
    scope: VariableScope for the created subgraph; default: "attention_decoder".
    initial_state_attention: If False (default), initial attentions are zero.
      If True, initialize the attentions from the initial state and attention
      states -- useful when we wish to resume decoding from a previously
      stored decoder state and attention states.
  Returns:
    A tuple of the form (outputs, state), where:
      outputs: A list of the same length as decoder_inputs of 2D Tensors of
        shape [batch_size x output_size]. These represent the generated outputs.
        Output i is computed from input i (which is either the i-th element
        of decoder_inputs or loop_function(output {i-1}, i)) as follows.
        First, we run the cell on a combination of the input and previous
        attention masks:
          cell_output, new_state = cell(linear(input, prev_attn), prev_state).
        Then, we calculate new attention masks:
          new_attn = softmax(V^T * tanh(W * attention_states + U * new_state))
        and then we calculate the output:
          output = linear(cell_output, new_attn).
      state: The state of each decoder cell the final time-step.
        It is a 2D Tensor of shape [batch_size x cell.state_size].
  Raises:
    ValueError: when num_heads is not positive, there are no inputs, shapes
      of attention_states are not set, or input size cannot be inferred
      from the input.
  """
  if not decoder_inputs:
    raise ValueError("Must provide at least 1 input to attention decoder.")
  if num_heads < 1:
    raise ValueError("With less than 1 heads, use a non-attention decoder.")
  if attention_states.get_shape()[2].value is None:
    raise ValueError("Shape[2] of attention_states must be known: %s" %
                     attention_states.get_shape())
  if output_size is None:
    output_size = cell.output_size

  with variable_scope.variable_scope(
      scope or "attention_decoder", dtype=dtype) as scope:
    dtype = scope.dtype

    batch_size = array_ops.shape(decoder_inputs[0])[0]  # Needed for reshaping.
    attn_length = attention_states.get_shape()[1].value
    if attn_length is None:
      attn_length = array_ops.shape(attention_states)[1]
    attn_size = attention_states.get_shape()[2].value

    # To calculate W1 * h_t we use a 1-by-1 convolution, need to reshape before.
    hidden = array_ops.reshape(attention_states,
                               [-1, attn_length, 1, attn_size])
    hidden_features = []
    v = []
    attention_vec_size = attn_size  # Size of query vectors for attention.
    for a in xrange(num_heads):
      k = variable_scope.get_variable("AttnW_%d" % a,
                                      [1, 1, attn_size, attention_vec_size])
      hidden_features.append(nn_ops.conv2d(hidden, k, [1, 1, 1, 1], "SAME"))
      v.append(
          variable_scope.get_variable("AttnV_%d" % a, [attention_vec_size]))

    state = initial_state

    def attention(query):
      """Put attention masks on hidden using hidden_features and query."""
      ds = []  # Results of attention reads will be stored here.
      if nest.is_sequence(query):  # If the query is a tuple, flatten it.
        query_list = nest.flatten(query)
        for q in query_list:  # Check that ndims == 2 if specified.
          ndims = q.get_shape().ndims
          if ndims:
            assert ndims == 2
        query = array_ops.concat(query_list, 1)
      for a in xrange(num_heads):
        with variable_scope.variable_scope("Attention_%d" % a):
          y = linear(query, attention_vec_size, True)
          y = array_ops.reshape(y, [-1, 1, 1, attention_vec_size])
          # Attention mask is a softmax of v^T * tanh(...).
          s = math_ops.reduce_sum(v[a] * math_ops.tanh(hidden_features[a] + y),
                                  [2, 3])
          a = nn_ops.softmax(s)
          # Now calculate the attention-weighted vector d.
          d = math_ops.reduce_sum(
              array_ops.reshape(a, [-1, attn_length, 1, 1]) * hidden, [1, 2])
          ds.append(array_ops.reshape(d, [-1, attn_size]))
      return ds

    outputs = []
    prev = None
    batch_attn_size = array_ops.stack([batch_size, attn_size])
    attns = [
        array_ops.zeros(
            batch_attn_size, dtype=dtype) for _ in xrange(num_heads)
    ]
    for a in attns:  # Ensure the second shape of attention vectors is set.
      a.set_shape([None, attn_size])
    if initial_state_attention:
      attns = attention(initial_state)
    for i, inp in enumerate(decoder_inputs):
      if i > 0:
        variable_scope.get_variable_scope().reuse_variables()
      # If loop_function is set, we use it instead of decoder_inputs.
      if loop_function is not None and prev is not None:
        with variable_scope.variable_scope("loop_function", reuse=True):
          inp = loop_function(prev, i)
      # Merge input and previous attentions into one vector of the right size.
      input_size = inp.get_shape().with_rank(2)[1]
      if input_size.value is None:
        raise ValueError("Could not infer input size from input: %s" % inp.name)
      x = linear([inp] + attns, input_size, True)
      # Run the RNN.
      cell_output, state = cell(x, state)
      # Run the attention mechanism.
      if i == 0 and initial_state_attention:
        with variable_scope.variable_scope(
            variable_scope.get_variable_scope(), reuse=True):
          attns = attention(state)
      else:
        attns = attention(state)

      with variable_scope.variable_scope("AttnOutputProjection"):
        output = linear([cell_output] + attns, output_size, True)
      if loop_function is not None:
        prev = output
      outputs.append(output)

  return outputs, state


def embedding_attention_decoder(decoder_inputs,
                                initial_state,
                                attention_states,
                                cell,
                                num_symbols,
                                embedding_size,
                                num_heads=1,
                                output_size=None,
                                output_projection=None,
                                feed_previous=False,
                                update_embedding_for_previous=True,
                                dtype=None,
                                scope=None,
                                initial_state_attention=False):
  """RNN decoder with embedding and attention and a pure-decoding option.
  Args:
    decoder_inputs: A list of 1D batch-sized int32 Tensors (decoder inputs).
    initial_state: 2D Tensor [batch_size x cell.state_size].
    attention_states: 3D Tensor [batch_size x attn_length x attn_size].
    cell: core_rnn_cell.RNNCell defining the cell function.
    num_symbols: Integer, how many symbols come into the embedding.
    embedding_size: Integer, the length of the embedding vector for each symbol.
    num_heads: Number of attention heads that read from attention_states.
    output_size: Size of the output vectors; if None, use output_size.
    output_projection: None or a pair (W, B) of output projection weights and
      biases; W has shape [output_size x num_symbols] and B has shape
      [num_symbols]; if provided and feed_previous=True, each fed previous
      output will first be multiplied by W and added B.
    feed_previous: Boolean; if True, only the first of decoder_inputs will be
      used (the "GO" symbol), and all other decoder inputs will be generated by:
        next = embedding_lookup(embedding, argmax(previous_output)),
      In effect, this implements a greedy decoder. It can also be used
      during training to emulate http://arxiv.org/abs/1506.03099.
      If False, decoder_inputs are used as given (the standard decoder case).
    update_embedding_for_previous: Boolean; if False and feed_previous=True,
      only the embedding for the first symbol of decoder_inputs (the "GO"
      symbol) will be updated by back propagation. Embeddings for the symbols
      generated from the decoder itself remain unchanged. This parameter has
      no effect if feed_previous=False.
    dtype: The dtype to use for the RNN initial states (default: tf.float32).
    scope: VariableScope for the created subgraph; defaults to
      "embedding_attention_decoder".
    initial_state_attention: If False (default), initial attentions are zero.
      If True, initialize the attentions from the initial state and attention
      states -- useful when we wish to resume decoding from a previously
      stored decoder state and attention states.
  Returns:
    A tuple of the form (outputs, state), where:
      outputs: A list of the same length as decoder_inputs of 2D Tensors with
        shape [batch_size x output_size] containing the generated outputs.
      state: The state of each decoder cell at the final time-step.
        It is a 2D Tensor of shape [batch_size x cell.state_size].
  Raises:
    ValueError: When output_projection has the wrong shape.
  """
  if output_size is None:
    output_size = cell.output_size
  if output_projection is not None:
    proj_biases = ops.convert_to_tensor(output_projection[1], dtype=dtype)
    proj_biases.get_shape().assert_is_compatible_with([num_symbols])

  with variable_scope.variable_scope(
      scope or "embedding_attention_decoder", dtype=dtype) as scope:

    embedding = variable_scope.get_variable("embedding",
                                            [num_symbols, embedding_size])
    loop_function = _extract_argmax_and_embed(
        embedding, output_projection,
        update_embedding_for_previous) if feed_previous else None
    emb_inp = [
        embedding_ops.embedding_lookup(embedding, i) for i in decoder_inputs
    ]
    return attention_decoder(
        emb_inp,
        initial_state,
        attention_states,
        cell,
        output_size=output_size,
        num_heads=num_heads,
        loop_function=loop_function,
        initial_state_attention=initial_state_attention)
def embedding_attention_seq2seq(encoder_inputs,
                                decoder_inputs,
                                cell,
                                num_encoder_symbols,
                                num_decoder_symbols,
                                embedding_size,
                                num_heads=1,
                                output_projection=None,
                                feed_previous=False,
                                dtype=None,
                                scope=None,
                                initial_state_attention=False):
  """Embedding sequence-to-sequence model with attention.
  This model first embeds encoder_inputs by a newly created embedding (of shape
  [num_encoder_symbols x input_size]). Then it runs an RNN to encode
  embedded encoder_inputs into a state vector. It keeps the outputs of this
  RNN at every step to use for attention later. Next, it embeds decoder_inputs
  by another newly created embedding (of shape [num_decoder_symbols x
  input_size]). Then it runs attention decoder, initialized with the last
  encoder state, on embedded decoder_inputs and attending to encoder outputs.
  Warning: when output_projection is None, the size of the attention vectors
  and variables will be made proportional to num_decoder_symbols, can be large.
  Args:
    encoder_inputs: A list of 1D int32 Tensors of shape [batch_size].
    decoder_inputs: A list of 1D int32 Tensors of shape [batch_size].
    cell: core_rnn_cell.RNNCell defining the cell function and size.
    num_encoder_symbols: Integer; number of symbols on the encoder side.
    num_decoder_symbols: Integer; number of symbols on the decoder side.
    embedding_size: Integer, the length of the embedding vector for each symbol.
    num_heads: Number of attention heads that read from attention_states.
    output_projection: None or a pair (W, B) of output projection weights and
      biases; W has shape [output_size x num_decoder_symbols] and B has
      shape [num_decoder_symbols]; if provided and feed_previous=True, each
      fed previous output will first be multiplied by W and added B.
    feed_previous: Boolean or scalar Boolean Tensor; if True, only the first
      of decoder_inputs will be used (the "GO" symbol), and all other decoder
      inputs will be taken from previous outputs (as in embedding_rnn_decoder).
      If False, decoder_inputs are used as given (the standard decoder case).
    dtype: The dtype of the initial RNN state (default: tf.float32).
    scope: VariableScope for the created subgraph; defaults to
      "embedding_attention_seq2seq".
    initial_state_attention: If False (default), initial attentions are zero.
      If True, initialize the attentions from the initial state and attention
      states.
  Returns:
    A tuple of the form (outputs, state), where:
      outputs: A list of the same length as decoder_inputs of 2D Tensors with
        shape [batch_size x num_decoder_symbols] containing the generated
        outputs.
      state: The state of each decoder cell at the final time-step.
        It is a 2D Tensor of shape [batch_size x cell.state_size].
  """
  with variable_scope.variable_scope(
      scope or "embedding_attention_seq2seq", dtype=dtype) as scope:
    dtype = scope.dtype
    # Encoder.
    encoder_cell = copy.deepcopy(cell)
    encoder_cell = core_rnn_cell.EmbeddingWrapper(
        encoder_cell,
        embedding_classes=num_encoder_symbols,
        embedding_size=embedding_size)
    encoder_outputs, encoder_state = core_rnn.static_rnn(
        encoder_cell, encoder_inputs, dtype=dtype)

    # First calculate a concatenation of encoder outputs to put attention on.
    top_states = [
        array_ops.reshape(e, [-1, 1, cell.output_size]) for e in encoder_outputs
    ]
    attention_states = array_ops.concat(top_states, 1)

    # Decoder.
    output_size = None
    if output_projection is None:
      cell = core_rnn_cell.OutputProjectionWrapper(cell, num_decoder_symbols)
      output_size = num_decoder_symbols

    if isinstance(feed_previous, bool):
      return embedding_attention_decoder(
          decoder_inputs,
          encoder_state,
          attention_states,
          cell,
          num_decoder_symbols,
          embedding_size,
          num_heads=num_heads,
          output_size=output_size,
          output_projection=output_projection,
          feed_previous=feed_previous,
          initial_state_attention=initial_state_attention)

    # If feed_previous is a Tensor, we construct 2 graphs and use cond.
    def decoder(feed_previous_bool):
      reuse = None if feed_previous_bool else True
      with variable_scope.variable_scope(
          variable_scope.get_variable_scope(), reuse=reuse):
        outputs, state = embedding_attention_decoder(
            decoder_inputs,
            encoder_state,
            attention_states,
            cell,
            num_decoder_symbols,
            embedding_size,
            num_heads=num_heads,
            output_size=output_size,
            output_projection=output_projection,
            feed_previous=feed_previous_bool,
            update_embedding_for_previous=False,
            initial_state_attention=initial_state_attention)
        state_list = [state]
        if nest.is_sequence(state):
          state_list = nest.flatten(state)
        return outputs + state_list

    outputs_and_state = control_flow_ops.cond(feed_previous,
                                              lambda: decoder(True),
                                              lambda: decoder(False))
    outputs_len = len(decoder_inputs)  # Outputs length same as decoder inputs.
    state_list = outputs_and_state[outputs_len:]
    state = state_list[0]
    if nest.is_sequence(encoder_state):
      state = nest.pack_sequence_as(
          structure=encoder_state, flat_sequence=state_list)
    return outputs_and_state[:outputs_len], state
	


*. Tracing model_with_buckets

def sequence_loss_by_example(logits,
                             targets,
                             weights,
                             average_across_timesteps=True,
                             softmax_loss_function=None,
                             name=None):
  """Weighted cross-entropy loss for a sequence of logits (per example).
  Args:
    logits: List of 2D Tensors of shape [batch_size x num_decoder_symbols].
    targets: List of 1D batch-sized int32 Tensors of the same length as logits.
    weights: List of 1D batch-sized float-Tensors of the same length as logits.
    average_across_timesteps: If set, divide the returned cost by the total
      label weight.
    softmax_loss_function: Function (labels, logits) -> loss-batch
      to be used instead of the standard softmax (the default if this is None).
      **Note that to avoid confusion, it is required for the function to accept
      named arguments.**
    name: Optional name for this operation, default: "sequence_loss_by_example".
  Returns:
    1D batch-sized float Tensor: The log-perplexity for each sequence.
  Raises:
    ValueError: If len(logits) is different from len(targets) or len(weights).
  """
  if len(targets) != len(logits) or len(weights) != len(logits):
    raise ValueError("Lengths of logits, weights, and targets must be the same "
                     "%d, %d, %d." % (len(logits), len(weights), len(targets)))
  with ops.name_scope(name, "sequence_loss_by_example",
                      logits + targets + weights):
    log_perp_list = []
    for logit, target, weight in zip(logits, targets, weights):
      if softmax_loss_function is None:
        # TODO(irving,ebrevdo): This reshape is needed because
        # sequence_loss_by_example is called with scalars sometimes, which
        # violates our general scalar strictness policy.
        target = array_ops.reshape(target, [-1])
        crossent = nn_ops.sparse_softmax_cross_entropy_with_logits(
            labels=target, logits=logit)
      else:
        crossent = softmax_loss_function(labels=target, logits=logit)
      log_perp_list.append(crossent * weight)
    log_perps = math_ops.add_n(log_perp_list)
    if average_across_timesteps:
      total_size = math_ops.add_n(weights)
      total_size += 1e-12  # Just to avoid division by 0 for all-0 weights.
      log_perps /= total_size
  return log_perps

def sequence_loss(logits,
                  targets,
                  weights,
                  average_across_timesteps=True,
                  average_across_batch=True,
                  softmax_loss_function=None,
                  name=None):
  """Weighted cross-entropy loss for a sequence of logits, batch-collapsed.
  Args:
    logits: List of 2D Tensors of shape [batch_size x num_decoder_symbols].
    targets: List of 1D batch-sized int32 Tensors of the same length as logits.
    weights: List of 1D batch-sized float-Tensors of the same length as logits.
    average_across_timesteps: If set, divide the returned cost by the total
      label weight.
    average_across_batch: If set, divide the returned cost by the batch size.
    softmax_loss_function: Function (labels, logits) -> loss-batch
      to be used instead of the standard softmax (the default if this is None).
      **Note that to avoid confusion, it is required for the function to accept
      named arguments.**
    name: Optional name for this operation, defaults to "sequence_loss".
  Returns:
    A scalar float Tensor: The average log-perplexity per symbol (weighted).
  Raises:
    ValueError: If len(logits) is different from len(targets) or len(weights).
  """
  with ops.name_scope(name, "sequence_loss", logits + targets + weights):
    cost = math_ops.reduce_sum(
        sequence_loss_by_example(
            logits,
            targets,
            weights,
            average_across_timesteps=average_across_timesteps,
            softmax_loss_function=softmax_loss_function))
    if average_across_batch:
      batch_size = array_ops.shape(targets[0])[0]
      return cost / math_ops.cast(batch_size, cost.dtype)
    else:
      return cost


def model_with_buckets(encoder_inputs,
                       decoder_inputs,
                       targets,
                       weights,
                       buckets,
                       seq2seq,
                       softmax_loss_function=None,
                       per_example_loss=False,
                       name=None):
  """Create a sequence-to-sequence model with support for bucketing.
  The seq2seq argument is a function that defines a sequence-to-sequence model,
  e.g., seq2seq = lambda x, y: basic_rnn_seq2seq(
      x, y, core_rnn_cell.GRUCell(24))
  Args:
    encoder_inputs: A list of Tensors to feed the encoder; first seq2seq input.
    decoder_inputs: A list of Tensors to feed the decoder; second seq2seq input.
    targets: A list of 1D batch-sized int32 Tensors (desired output sequence).
    weights: List of 1D batch-sized float-Tensors to weight the targets.
    buckets: A list of pairs of (input size, output size) for each bucket.
    seq2seq: A sequence-to-sequence model function; it takes 2 input that
      agree with encoder_inputs and decoder_inputs, and returns a pair
      consisting of outputs and states (as, e.g., basic_rnn_seq2seq).
    softmax_loss_function: Function (labels, logits) -> loss-batch
      to be used instead of the standard softmax (the default if this is None).
      **Note that to avoid confusion, it is required for the function to accept
      named arguments.**
    per_example_loss: Boolean. If set, the returned loss will be a batch-sized
      tensor of losses for each sequence in the batch. If unset, it will be
      a scalar with the averaged loss from all examples.
    name: Optional name for this operation, defaults to "model_with_buckets".
  Returns:
    A tuple of the form (outputs, losses), where:
      outputs: The outputs for each bucket. Its j'th element consists of a list
        of 2D Tensors. The shape of output tensors can be either
        [batch_size x output_size] or [batch_size x num_decoder_symbols]
        depending on the seq2seq model used.
      losses: List of scalar Tensors, representing losses for each bucket, or,
        if per_example_loss is set, a list of 1D batch-sized float Tensors.
  Raises:
    ValueError: If length of encoder_inputs, targets, or weights is smaller
      than the largest (last) bucket.
  """
  if len(encoder_inputs) < buckets[-1][0]:
    raise ValueError("Length of encoder_inputs (%d) must be at least that of la"
                     "st bucket (%d)." % (len(encoder_inputs), buckets[-1][0]))
  if len(targets) < buckets[-1][1]:
    raise ValueError("Length of targets (%d) must be at least that of last"
                     "bucket (%d)." % (len(targets), buckets[-1][1]))
  if len(weights) < buckets[-1][1]:
    raise ValueError("Length of weights (%d) must be at least that of last"
                     "bucket (%d)." % (len(weights), buckets[-1][1]))

  all_inputs = encoder_inputs + decoder_inputs + targets + weights
  losses = []
  outputs = []
  with ops.name_scope(name, "model_with_buckets", all_inputs):
    for j, bucket in enumerate(buckets):
      with variable_scope.variable_scope(
          variable_scope.get_variable_scope(), reuse=True if j > 0 else None):
        bucket_outputs, _ = seq2seq(encoder_inputs[:bucket[0]],
                                    decoder_inputs[:bucket[1]])
        outputs.append(bucket_outputs)
        if per_example_loss:
          losses.append(
              sequence_loss_by_example(
                  outputs[-1],
                  targets[:bucket[1]],
                  weights[:bucket[1]],
                  softmax_loss_function=softmax_loss_function))
        else:
          losses.append(
              sequence_loss(
                  outputs[-1],
                  targets[:bucket[1]],
                  weights[:bucket[1]],
                  softmax_loss_function=softmax_loss_function))

  return outputs, losses
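
A usage sketch (TF 1.x legacy_seq2seq paths; all sizes here are illustrative assumptions):

import tensorflow as tf

buckets = [(5, 10)]
cell = tf.contrib.rnn.GRUCell(64)
encoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(5)]
decoder_inputs = [tf.placeholder(tf.int32, [None]) for _ in range(10)]
targets = decoder_inputs[1:] + [tf.placeholder(tf.int32, [None])]
weights = [tf.placeholder(tf.float32, [None]) for _ in range(10)]

seq2seq_f = lambda x, y: tf.contrib.legacy_seq2seq.embedding_attention_seq2seq(
    x, y, cell, num_encoder_symbols=100, num_decoder_symbols=100,
    embedding_size=32)
outputs, losses = tf.contrib.legacy_seq2seq.model_with_buckets(
    encoder_inputs, decoder_inputs, targets, weights, buckets, seq2seq_f)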