Reinforcement Learning Resource List

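Artificial intelligence is one of the most exciting technologies of the 21st century. Its goal is to create human-like intelligence, and human intelligence spans perception, decision-making, and cognition (from intuition to reasoning, planning, consciousness, and so on). Perception answers "what", and deep learning has already matched or surpassed human performance on some perception tasks; decision-making answers "how", and reinforcement learning has achieved notable results in areas such as games and robotics; cognition answers "why", and knowledge graphs, causal inference, and continual learning are under active research. Reinforcement learning solves sequential decision-making problems by learning from feedback, which is why it is often viewed as a key to artificial general intelligence.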

Courses and Videos

Reinforcement Learning by David Silver (2015) [homepage] [youtube] [bilibili]

CS 188: Introduction to Artificial Intelligence [Fall 2012-Spring 2014] [Fall 2018] [Summer 2019] [Spring 2020]

CS 294: Deep Reinforcement Learning by Sergey Levine [Fall 2015] [Spring 2017] [Fall 2017] [Fall 2018]

CS 285: Deep Reinforcement Learning [Fall 2019] [youtube]

Advanced Deep Learning & Reinforcement Learning by DeepMind & UCL [youtube2018]

Deep Reinforcement Learning and Control [Spring 2017]

CS234: Reinforcement Learning [Winter 2019] [youtube]

Deep RL Bootcamp [August 2017]

Deep Reinforcement Learning by Hung-yi Lee (李宏毅) [Spring 2018] [youtube2018]

Reinforcement Learning by Morvan Zhou (莫烦) [homepage]

Books

Reinforcement Learning: An Introduction (1st Edition, 1998) [homepage]

Reinforcement Learning: An Introduction (2nd Edition, 2018) [homepage] [bookdraft2018jan1] [2018] [Python Code] [Chinese translation]

Exercise solutions for the Sutton textbook:
(1) [LyWangPX/Reinforcement-Learning-2nd-Edition-by-Sutton-Exercise-Solutions]
(2) [JKCooper2/rlai-exercises]

Hands-On Reinforcement Learning With Python (2018) [homepage]

Reinforcement Learning With Open AI TensorFlow and Keras Using Python (2018) [homepage]

Algorithms for Reinforcement Learning (2010) [download]

Neural Networks and Deep Learning (《神经网络与深度学习》) [download]

Code

ShangtongZhang/Python Implementation of Reinforcement Learning: An Introduction (2nd Edition) [github]

JuliaReinforcementLearning/ReinforcementLearningAnIntroduction.jl [github]

berkeleydeeprlcourse [github]

tensorlayer/RLzoo [github]

rlcode/reinforcement-learning [github]

MorvanZhou/Reinforcement-learning-with-tensorflow [github]

dennybritz/reinforcement-learning [github]

p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch [github]

Tutorials

OpenAI Spinning Up [English version] [Chinese version]

Talks

Rich Sutton, 2015, Introduction to Reinforcement Learning with Function Approximation

Andrew Barto, 2018, A history of reinforcement learning

David Silver, Principles of Deep RL

Benjamin Recht, 2018, Optimization Perspectives on Learning to Control

John Schulman, 2017, The Nuts and Bolts of Deep Reinforcement Learning Research

Joelle Pineau, Introduction to Reinforcement Learning

Deep Learning and Reinforcement Learning Summer School, 2018, 2017

Deep Learning Summer School, 2016, 2015

Yisong Yue and Hoang M. Le, Imitation Learning, ICML 2018 Tutorial

Surveys

Li, Y. (2017). Deep Reinforcement Learning: An Overview. ArXiv. [paper]

Littman, M. L. (2015). Reinforcement learning improves behaviour from evaluative feedback. Nature, 521:445–451. [paper]

Kaelbling, L., Littman, M., and Moore, A. (1996). Reinforcement learning: A survey. Journal of Artificial Intelligence Research, 4:237–285. [paper]

Algorithms

(1) Reinforcement Learning

  • Q-learning
    Learning from Delayed Rewards (Watkins 1989) [paper]
  • REINFORCE
    Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning (Williams 1992) [paper] [ML]
  • SARSA
    Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding (Sutton 1996) [paper] [NIPS]
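
For readers who want a concrete picture before opening the papers, here is a minimal sketch of tabular Q-learning. It is not taken from any of the listed papers or repositories: the five-state corridor environment, the `step` and `greedy` helpers, and all hyperparameter values are hypothetical choices made for illustration only.

```python
import random

# Hypothetical toy environment: a 5-state corridor. The agent starts in state 0
# and receives reward 1 on reaching state 4 (terminal); all other steps give 0.
N_STATES = 5
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Return (next_state, reward, done) for the corridor."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def greedy(q_row, rng):
    """Pick an action with maximal value, breaking ties at random."""
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])


def q_learning(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table for the corridor with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy behaviour policy.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = greedy(q[state], rng)
            next_state, reward, done = step(state, action)
            # The target bootstraps from the best next action (off-policy).
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy policy for the non-terminal states 0..3.
policy = [greedy(row, random.Random(0)) for row in q_table[:-1]]
print(policy)
```

The only line that makes this Q-learning rather than SARSA is the target: Q-learning uses the maximum over next actions, whereas SARSA would use the value of the action the behaviour policy actually takes next.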

(2) Deep Reinforcement Learning

  • DQN
    Playing Atari with Deep Reinforcement Learning (Mnih et al. 2013) [arxiv]
  • DDQN
    Deep Reinforcement Learning with Double Q-learning (van Hasselt et al. 2015) [arxiv] [AAAI]
  • TRPO
    Trust Region Policy Optimization (Schulman et al. 2015) [arxiv] [ICML]
  • H-DQN
    Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation (Kulkarni et al. 2016) [arxiv] [NIPS]
  • PER
    Prioritized Experience Replay (Schaul et al. 2016) [arxiv] [ICLR]
  • Dueling DDQN
    Dueling Network Architectures for Deep Reinforcement Learning (Wang et al. 2016) [arxiv] [ICML]
  • DDPG
    Continuous Control With Deep Reinforcement Learning (Lillicrap et al. 2016) [arxiv] [ICLR]
  • A2C/A3C
    Asynchronous Methods for Deep Reinforcement Learning (Mnih et al. 2016) [arxiv] [ICML]
  • SNN-HRL
    Stochastic Neural Networks For Hierarchical Reinforcement Learning (Florensa et al. 2017) [arxiv] [ICLR]
  • PPO
    Proximal Policy Optimization Algorithms (Schulman et al. 2017) [arxiv]
  • HER
    Hindsight Experience Replay (Andrychowicz et al. 2017) [arxiv] [NIPS]
  • TD3
    Addressing Function Approximation Error in Actor-Critic Methods (Fujimoto et al. 2018) [arxiv] [ICML]
  • DIAYN
    Diversity is All You Need: Learning Skills Without a Reward Function (Eysenbach et al. 2018) [arxiv] [ICLR]
  • HIRO
    Data-Efficient Hierarchical Reinforcement Learning (Nachum et al. 2018) [arxiv] [NIPS]
  • SAC
    Soft Actor-Critic Algorithms and Applications (Haarnoja et al. 2019) [arxiv]
  • SAC-Discrete
    Soft Actor-Critic For Discrete Action Settings (Christodoulou 2019) [arxiv]

Environments

Cart Pole
Mountain Car
OpenAI Gym
Google Dopamine 2.0
Emo Todorov MuJoCo
General-purpose grid-world environment class

Frameworks

OpenAI Baselines
Baidu PARL
DeepMind OpenSpiel

Researchers

Richard S. Sutton [homepage]
David Silver [homepage]
Pieter Abbeel [homepage]
Sergey Levine [homepage]
Hung-yi Lee (李宏毅) [homepage]

Conferences/Journals

Conferences: AAAI, NIPS, ICML, ICLR, IJCAI, AAMAS, IROS, and others.

Journals: AI, JMLR, JAIR, Machine Learning, JAAMAS, and others.

Research Institutions

OpenAI
DeepMind
Berkeley Artificial Intelligence Research (BAIR) Lab

Blogs

Keavnn's Blog
Medium : Reinforcement Learning
StackOverflow : Reinforcement Learning

Zhihu

Reinforcement Learning Knowledge Lecture Hall (强化学习知识大讲堂)
Intelligent Unit (智能单元)
Reinforcement Learning (强化学习)

WeChat Official Accounts

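Deep Reinforcement Learning Lab (深度强化学习实验室)
Deep Learning Technology Frontier (深度学习技术前沿)
AI Technology Review (AI科技评论)
AI Era (新智元)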

Other

kmario23/deep-learning-drizzle [github] [webpage]

Mr.Jk.Zhang [CSDN]
