Foundational articles on reinforcement learning and value iteration


Value iteration:

1. Bertsekas, D. P., & Tsitsiklis, J. N. (1989). Parallel and Distributed Computation: Numerical Methods. Prentice Hall. Republished by Athena Scientific in 1997.

2. Moore, A. W., & Atkeson, C. G. (1993). Prioritized sweeping: Reinforcement learning with less data and less time. Machine Learning, 13(1), 103-130.

3. Peng, J., & Williams, R. J. (1993). Efficient learning and planning within the Dyna framework. In Proceedings of the Second International Conference on Simulation of Adaptive Behavior, pp. 281-290.
