[Book Reading Ch4] Reinforcement Learning: An Introduction, 2nd Edition
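Chapter 4: Dynamic Programming
Review and introduction
4.1 Policy Evaluation
RPage:74, below Equation 4.5
Corresponding pseudocode
Figure 4.1
4.2 Policy Improvement
Corresponding pseudocode
4.3 Policy Iteration
Example 4.2 car_rental
4.4 Value Iteration
Corresponding pseudocode
All Exercise Part

Preface: for Chapters 1 and 2, click here; for Chapter 3, click here.
Note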