CMU 10703: Deep Reinforcement Learning and Control, Spring 2017

Homepage

Warm up

  • First chapters of Reinforcement Learning: An Introduction, Sutton & Barto, Second Edition (pdf), also available as an ebook here
  • Dave Silver’s course and lecture videos on reinforcement learning

Schedule

 

| Date | Topics | Lecturer | Readings | Additional Material |
| --- | --- | --- | --- | --- |
| Wed Jan 18 | Course Introduction | Katerina | | |
| Mon Jan 23 | Intro to MDPs, POMDPs | Katerina | Sutton & Barto Ch 3 | |
| Wed Jan 25 | Solving known MDPs: Dynamic Programming, Value Iteration, Policy Iteration, Policy Evaluation | Katerina | Sutton & Barto Ch 4 | |
| Mon Jan 30 | Monte Carlo Learning: value function estimation and optimization | Russ | Sutton & Barto Ch 5 | |
| Wed Feb 1 | Temporal Difference Learning: value function estimation and optimization, Q learning, SARSA | Russ | Sutton & Barto Ch 6 | |
| Mon Feb 6 | Planning and Learning (1): Tabular methods, Dyna, Monte Carlo Tree Search | Katerina | Sutton & Barto Ch 8 | A Survey of Monte Carlo Tree Search Methods http://www.cameronius.com/cv/mcts-survey-master.pdf |
| Wed Feb 8 | Value function approximation, Deep Learning, Convnets, backpropagation | Russ | | |
| Mon Feb 13 | Value function approximation, Deep Learning, Convnets, backpropagation | Russ | | |
| Wed Feb 15 | Deep Q Learning: Double Q learning, replay memory | Russ | | |
| Mon Feb 20 | Policy Gradients (1): REINFORCE, Natural Policy gradients, Variance reduction in gradient estimation, Actor-Critic, Deep Actor-Critic, TRPO | Russ | Sutton & Barto Ch 13 | |
| Wed Feb 22 | Policy Gradients (2) | Russ | | |
| Mon Feb 27 | Policy Gradients (3) | Russ | | |
| Wed Mar 1 | Closer look at Continuous Actions, Variational Autoencoders, multimodal stochastic policies | Russ | | |
| Mon Mar 6 | Exploration (1) | Katerina | | Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models https://arxiv.org/abs/1507.00814; Variational Information Maximizing Exploration https://arxiv.org/abs/1605.09674; visitation counts, hashing |
| Wed Mar 8 | Imitation learning (1): mimicking experts, behaviour cloning | Katerina | | An Invitation to Imitation http://www.ri.cmu.edu/publication_view.html?pub_id=7891; Generative adversarial imitation learning https://arxiv.org/abs/1606.03476 |
| Mon Mar 13 | Spring break! | | | |
| Wed Mar 15 | Spring break! | | | |
| Mon Mar 20 | Imitation learning (2): Learning reward functions from demonstration, IOC, IRL | | | A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning http://www.jmlr.org/proceedings/papers/v15/ross11a/ross11a.pdf; Generative adversarial imitation learning https://arxiv.org/abs/1606.03476; Maximum entropy inverse reinforcement learning http://www.aaai.org/Papers/AAAI/2008/AAAI08-227.pdf; Learning to search: Functional gradient techniques for imitation learning http://www.ri.cmu.edu/publication_view.html?pub_id=6410 |
| Wed Mar 22 | Intro to optimal control, Differential Dynamic Programming, LQR, iterative-LQR | Katerina | | |
| Mon Mar 27 | Imitation learning (3): learning from optimal controllers, self trials | Katerina | | End-to-End Training of Deep Visuomotor Policies https://arxiv.org/pdf/1504.00702.pdf; PLATO: Policy Learning using Adaptive Trajectory Optimization https://arxiv.org/pdf/1603.00622v3.pdf |
| Wed Mar 29 | Planning and Learning (2): Learning Forward/Backward Models from experience, Planning with learned forward models, simulation to real world adaptation | Katerina | | SE3-Nets: Learning Rigid Body Motion using Deep Neural Networks https://arxiv.org/pdf/1606.02378v2.pdf |
| Mon Apr 3 | Planning and Learning (3) | | | |
| Wed Apr 5 | Case studies: AlphaGo, deep math | Katerina | | |
| Mon Apr 10 | Modular / Hierarchical RL (1): compositionality, temporal abstraction | | | |
| Wed Apr 12 | Modular / Hierarchical RL (2): Multi-task learning, curriculum learning | Russ | | |
| Mon Apr 17 | Exploration (2): Learning and exploration in 3D environments, Long Term Memory | Russ | | |
| Wed Apr 19 | Learning Motor Control: inspiration from Psychology | | Sutton & Barto Ch 14, 15 | |
| Mon Apr 24 | Frontiers/Open Problems | Katerina | | |
| Wed Apr 26 | Project Presentations | | | |
| Mon May 1 | Project Presentations | | | |
| Wed May 3 | Project Presentations | | | |

Log

Week 1:

Jan 18 - Introduction

  • Slide 1
    • 1/23/2017;
  • First chapters of Reinforcement Learning: An Introduction, Sutton & Barto, Second Edition
    • Chapter 1: 1/19/2017;
  • Lectures 1 & 2 from Dave Silver’s course and lecture videos on reinforcement learning
    • Lecture 1: 1/17/2017;

Week 2:

Jan 23 - Intro to MDPs, POMDPs

  • Slide
  • Sutton & Barto Ch 3
    • 3.1, 3.2, 3.3: 1/23/2017;

Jan 25 - Solving known MDPs: Dynamic Programming, Value Iteration, Policy Iteration, Policy Evaluation

  • Slide
  • Sutton & Barto Ch 4
    • 4.1: 1/25/2017;
  • Implement Markov Decision Processes in Python
    • AIMA Python file: mdp.py (code taken from Artificial Intelligence: A Modern Approach by Stuart Russell and Peter Norvig)
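
As a companion to the Jan 25 topics (Value Iteration on a known MDP), here is a minimal sketch in plain Python. It is not the AIMA mdp.py code: the three-state chain, the action names, the rewards, and the discount factor are all hypothetical, chosen only so the result can be checked by hand.

```python
# Hypothetical 3-state chain MDP (an assumption for illustration; not from
# the course slides or from AIMA's mdp.py).
# TRANSITIONS[state][action] is a list of (probability, next_state, reward).
TRANSITIONS = {
    "s0": {"stay": [(1.0, "s0", 0.0)], "go": [(1.0, "s1", 0.0)]},
    "s1": {"stay": [(1.0, "s1", 0.0)], "go": [(1.0, "s2", 1.0)]},
    "s2": {},  # terminal state: no actions available
}
GAMMA = 0.9  # discount factor (assumed)


def q_value(values, state, action):
    """Expected one-step return of taking `action` in `state`."""
    return sum(
        prob * (reward + GAMMA * values[next_state])
        for prob, next_state, reward in TRANSITIONS[state][action]
    )


def value_iteration(tol=1e-8):
    """Apply the Bellman optimality backup until values change by < tol."""
    values = {state: 0.0 for state in TRANSITIONS}
    while True:
        delta = 0.0
        for state, actions in TRANSITIONS.items():
            if not actions:  # terminal states keep value 0
                continue
            best = max(q_value(values, state, a) for a in actions)
            delta = max(delta, abs(best - values[state]))
            values[state] = best  # in-place (Gauss-Seidel style) update
        if delta < tol:
            return values


def greedy_policy(values):
    """Pick, in each non-terminal state, the action with the highest Q-value."""
    return {
        state: max(actions, key=lambda a: q_value(values, state, a))
        for state, actions in TRANSITIONS.items()
        if actions
    }


values = value_iteration()
policy = greedy_policy(values)
```

In this toy chain the values settle at 0.9 for s0, 1.0 for s1 and 0.0 for the terminal s2, and the greedy policy chooses "go" in both non-terminal states. Policy Iteration would alternate a policy evaluation sweep with the same greedy improvement step.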

 

Reposted from: https://www.cnblogs.com/casperwin/p/6295396.html
