RL note (1): Why exploration is needed

[Sutton's book, Section 2.2]

Why exploration is needed:

1. Greedy selection performs worse in the long run, because the agent often gets stuck repeatedly choosing a suboptimal action.

2. When the true values of the actions change over time (a nonstationary task), exploration is needed to check that none of the non-greedy actions has become better than the greedy one.

3. The noisier the rewards (the larger their variance), the more exploration it takes to find the best action, so exploratory methods gain a larger advantage over greedy selection.
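
As a rough illustration of point 1, here is a minimal sketch of a 10-armed bandit experiment that compares greedy selection (epsilon = 0) with epsilon-greedy selection. The function names (run_bandit, average_late_reward), the reward_std parameter, the number of runs, and the seed are all assumptions made for this example and do not come from the book. Raising reward_std is one way to experiment with point 3.

```python
import random


def run_bandit(epsilon, n_arms=10, n_steps=1000, reward_std=1.0, rng=None):
    """Run one bandit task with sample-average estimates; return the rewards."""
    rng = rng or random.Random()
    # True action values, drawn from a unit normal (stationary task).
    true_values = [rng.gauss(0.0, 1.0) for _ in range(n_arms)]
    estimates = [0.0] * n_arms  # current value estimate for each action
    counts = [0] * n_arms       # number of times each action was taken
    rewards = []
    for _ in range(n_steps):
        if rng.random() < epsilon:
            # Explore: pick any action uniformly at random.
            action = rng.randrange(n_arms)
        else:
            # Exploit: pick a greedy action, breaking ties at random.
            best = max(estimates)
            action = rng.choice(
                [a for a, q in enumerate(estimates) if q == best]
            )
        reward = rng.gauss(true_values[action], reward_std)
        counts[action] += 1
        # Incremental update of the sample average.
        estimates[action] += (reward - estimates[action]) / counts[action]
        rewards.append(reward)
    return rewards


def average_late_reward(epsilon, n_runs=200, n_steps=1000, seed=0):
    """Average reward over the second half of each run, averaged over runs."""
    rng = random.Random(seed)
    half = n_steps // 2
    total = 0.0
    for _ in range(n_runs):
        rewards = run_bandit(epsilon, n_steps=n_steps, rng=rng)
        total += sum(rewards[half:]) / (n_steps - half)
    return total / n_runs


if __name__ == "__main__":
    print("greedy        :", average_late_reward(0.0))
    print("epsilon = 0.1 :", average_late_reward(0.1))
```

Only the second half of each run is averaged, since the point is about long-run behavior: early on the two methods can look similar, and the gap shows up once the greedy agent has locked onto whichever action happened to look best first.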

