Foundational works on reinforcement learning and value iteration

RL:

Sutton, R. S., & Barto, A. G. Reinforcement Learning: An Introduction.
http://cdn.preterhuman.net/texts/science_and_technology/artificial_intelligence/Reinforcement%20Learning%20%20An%20Introduction%20-%20Richard%20S.%20Sutton%20,%20Andrew%20G.%20Barto.pdf

Value iteration:

1. Bertsekas, D. P., & Tsitsiklis, J. N. (1989). Parallel and Distributed Computation: Numerical Methods. Prentice Hall. Republished by Athena Scientific in 1997.

2. Moore, A. W., & Atkeson, C. G. (1993). Prioritized sweeping: Reinforcement learning with less data and less time. Machine Learning, 13(1), 103-130.

3. Peng, J., & Williams, R. J. (1993). Efficient learning and planning within the Dyna framework. In Proceedings of the Second International Conference on Simulation of Adaptive Behavior, pp. 281-290.
