Lecture 14 | Deep Reinforcement Learning

value iteration
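
A minimal sketch of value iteration on a tabular MDP, assuming the standard Bellman backup Q(s, a) = R(s, a) + gamma * sum over s' of P(s' | s, a) * max over a' of Q(s', a'). The two-state MDP, the array layout, and the names `value_iteration`, `P`, and `R` are hypothetical and chosen only for illustration; they are not from the lecture.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Iterate the Bellman optimality backup until Q stops changing.

    P: array of shape (A, S, S), P[a, s, t] = probability of s -> t under action a.
    R: array of shape (S, A), expected immediate reward for taking a in s.
    """
    n_actions, n_states, _ = P.shape
    Q = np.zeros((n_states, n_actions))
    for _ in range(max_iters):
        V = Q.max(axis=1)  # V(s') = max_a' Q(s', a')
        # Q(s, a) = R(s, a) + gamma * sum_s' P(s' | s, a) * V(s')
        Q_new = R + gamma * np.einsum("ast,t->sa", P, V)
        converged = np.max(np.abs(Q_new - Q)) < tol
        Q = Q_new
        if converged:
            break
    return Q

# Hypothetical MDP: action 0 stays in the current state, action 1 switches state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
R = np.array([
    [0.0, 1.0],  # state 0: stay pays 0, switch pays 1
    [2.0, 0.0],  # state 1: stay pays 2, switch pays 0
])

Q = value_iteration(P, R)
policy = Q.argmax(axis=1)  # greedy policy with respect to the converged Q
```

This only works when every (state, action) pair can be enumerated in a table, which is why large state spaces such as raw pixels call for a function approximator instead.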

https://math.stackexchange.com/questions/2639577/why-is-the-gradient-of-this-expectation-intractable

Idea: rewrite the high-dimensional integral as an expectation, which can then be estimated by sampling.
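
A small sketch of that idea on a one-dimensional toy problem, assuming the score-function (log-derivative) identity: grad_theta E_{x ~ p_theta}[f(x)] = E_{x ~ p_theta}[f(x) * grad_theta log p_theta(x)]. The Gaussian distribution, the choice f(x) = x^2, and the function name `score_function_gradient` are assumptions made for illustration. In the policy-gradient setting x would be a whole trajectory, but the mechanics are the same.

```python
import numpy as np

def score_function_gradient(theta, f, n_samples=200_000, seed=0):
    """Monte Carlo estimate of d/dtheta E_{x ~ N(theta, 1)}[f(x)].

    Instead of integrating f(x) * d/dtheta p_theta(x) over x, sample x from
    p_theta and average f(x) * d/dtheta log p_theta(x).
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(loc=theta, scale=1.0, size=n_samples)
    score = x - theta  # d/dtheta log N(x; theta, 1)
    return np.mean(f(x) * score)

theta = 1.5
grad_estimate = score_function_gradient(theta, lambda x: x ** 2)

# Closed form for this toy case: E[x^2] = theta^2 + 1, so the gradient is 2 * theta.
grad_exact = 2.0 * theta
```

The estimate is unbiased but noisy, which is why many samples are drawn here; that variance is the usual practical difficulty with this estimator.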

computational efficiency -> low resolution to high resolution

Hard attention -> many applications -> improves efficiency

But it still needs an RNN -> may be slow

Efficiency depends on the case

High-resolution input -> fast with this method
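
A rough back-of-the-envelope sketch of why glimpses can pay off on high-resolution input: compare the pixels touched by processing the full image with the pixels touched by a few small glimpses. The image size, glimpse size, glimpse count, and the helper `pixels_processed` are all hypothetical, and the comparison ignores the cost of the RNN and the fact that glimpses are taken sequentially.

```python
def pixels_processed(image_side, glimpse_side, n_glimpses):
    """Return (pixels in the full image, pixels covered by all glimpses)."""
    full = image_side ** 2
    glimpsed = n_glimpses * glimpse_side ** 2
    return full, glimpsed

# Hypothetical setting: a 1024x1024 image viewed through 8 glimpses of 64x64.
full, glimpsed = pixels_processed(image_side=1024, glimpse_side=64, n_glimpses=8)
ratio = full / glimpsed  # how many times fewer pixels the glimpses touch
```

For small inputs the ratio shrinks toward 1 and the sequential RNN steps can dominate, which is consistent with the remark that efficiency depends on the case.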

Q-learning may be harder to tune.
