Training recurrent networks online without backtracking

    Maintaining the full Jacobian $G(t) = \partial h(t) / \partial \theta$, as in real-time recurrent learning, costs $O(\dim(h) \cdot \dim(\theta))$ memory and time per step. This prevents computing or even storing $G(t)$ for moderately large-dimensional dynamical systems, such as recurrent neural networks.

1 The NoBackTrack algorithm

1.1 The rank-one trick: an expectation-preserving reduction

    We propose to build an approximation $\tilde{G}(t)$ of $G(t)$ which is cheap to store and to maintain online, and whose expectation equals $G(t)$. The construction of this unbiased $\tilde{G}(t)$ is based on the following "rank-one trick".

    Proposition (rank-one trick). Let $A = \sum_{i=1}^{k} v_i w_i^\top$ be a decomposition of a matrix $A$ into a sum of rank-one terms, and let $\varepsilon_1, \dots, \varepsilon_k$ be independent uniform random signs in $\{-1, 1\}$. Then the rank-one matrix
$$\tilde{A} := \Big( \sum_{i} \varepsilon_i v_i \Big) \Big( \sum_{j} \varepsilon_j w_j \Big)^{\top}$$
satisfies $\mathbb{E}\,\tilde{A} = A$: indeed $\mathbb{E}[\varepsilon_i \varepsilon_j] = \mathbb{1}_{i=j}$, so $\mathbb{E}\,\tilde{A} = \sum_{i,j} \mathbb{E}[\varepsilon_i \varepsilon_j]\, v_i w_j^\top = \sum_i v_i w_i^\top = A$. Storing $\tilde{A}$ requires only the two vectors $\sum_i \varepsilon_i v_i$ and $\sum_i \varepsilon_i w_i$, i.e. $O(\dim v + \dim w)$ memory instead of $O(\dim v \cdot \dim w)$.
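    The unbiasedness is easy to check numerically. Below is a minimal NumPy sketch (the function name `rank_one_reduce` and the test dimensions are illustrative, not from the paper): it collapses a sum of rank-one terms into a single random rank-one matrix and verifies that the empirical mean of many such reductions approaches $A$.

```python
import numpy as np

rng = np.random.default_rng(0)

def rank_one_reduce(vs, ws, rng):
    """Collapse A = sum_i outer(vs[i], ws[i]) into one random rank-one
    matrix whose expectation is A (the rank-one trick)."""
    signs = rng.choice([-1.0, 1.0], size=len(vs))  # independent uniform signs
    v = (signs[:, None] * vs).sum(axis=0)          # sum_i eps_i v_i
    w = (signs[:, None] * ws).sum(axis=0)          # sum_j eps_j w_j
    return np.outer(v, w)

# Empirical check of unbiasedness: E[eps_i eps_j] = 1 iff i == j, so the
# cross terms cancel in expectation and the mean reduction approaches A.
k, n, m = 5, 4, 3
vs = rng.standard_normal((k, n))
ws = rng.standard_normal((k, m))
A = sum(np.outer(vi, wi) for vi, wi in zip(vs, ws))
mean = np.mean([rank_one_reduce(vs, ws, rng) for _ in range(100_000)], axis=0)
print(np.abs(mean - A).max())  # close to 0, shrinking like 1/sqrt(#samples)
```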

    The rank-one reduction $\tilde{A}$ depends not only on the value of $A$, but also on the way $A$ is decomposed as a sum of rank-one terms. In the applications to recurrent networks below, there is a natural such choice.

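    To illustrate that natural decomposition, here is a hedged NumPy sketch of how the trick can maintain a rank-one estimate $\tilde{G}(t) = v(t)\, w(t)^\top$ of $G(t) = \partial h(t) / \partial \mathrm{vec}(W)$ online, for a vanilla recurrent network $h(t+1) = \tanh(W h(t))$. The model, the loop structure, and the variance-reducing scalings $\rho_i \approx \sqrt{\lVert w_i \rVert / \lVert v_i \rVert}$ are assumptions chosen for illustration; this is a sketch of the idea, not the paper's exact pseudocode.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8                                    # state dimension; parameters are W (n x n)
W = rng.standard_normal((n, n)) / np.sqrt(n)
h = np.tanh(rng.standard_normal(n))

# Rank-one estimate G~(t) = outer(v, w) of G(t) = dh(t)/dvec(W),
# with v in state space (size n) and w in parameter space (size n*n).
v = np.zeros(n)
w = np.zeros(n * n)

for t in range(100):
    a = W @ h
    h_new = np.tanh(a)
    d = 1.0 - h_new ** 2                 # tanh'(a)

    # Exact propagation: G(t+1) = J_h G(t) + df/dvec(W), with J_h = diag(d) W.
    # The propagated part of a rank-one estimate stays rank one: (J_h v) w^T.
    vs = [np.diag(d) @ W @ v]
    ws = [w]
    # df/dvec(W) decomposes naturally into n rank-one terms:
    # row i contributes (d_i e_i) (e_i kron h)^T  (row-major vec of W).
    for i in range(n):
        vi = np.zeros(n); vi[i] = d[i]
        wi = np.zeros(n * n); wi[i * n:(i + 1) * n] = h
        vs.append(vi); ws.append(wi)

    # Rank-one trick on the n + 1 terms, with scalings rho_i (unbiased for
    # any fixed rho_i > 0; the sqrt ratio is a variance-reducing choice).
    eps = rng.choice([-1.0, 1.0], size=len(vs))
    rho = [np.sqrt((np.linalg.norm(wi) + 1e-12) / (np.linalg.norm(vi) + 1e-12))
           for vi, wi in zip(vs, ws)]
    v = sum(e * r * vi for e, r, vi in zip(eps, rho, vs))
    w = sum(e / r * wi for e, r, wi in zip(eps, rho, ws))
    h = h_new

# G~(t) is kept as two vectors, so memory is O(n + n^2) rather than O(n^3);
# a loss gradient dL/dvec(W) would be estimated as (dL/dh . v) * w.
print(np.linalg.norm(v), np.linalg.norm(w))
```

    Since each update is unbiased conditionally on the current $(v, w)$, a simple induction gives $\mathbb{E}\,\tilde{G}(t) = G(t)$ at every step, which is what makes this online estimate usable in place of the full Jacobian.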
