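Follow: Decision Intelligence and Machine Learning, for in-depth, no-filler AI content
Source | EndtoEnd.ai
Author | DeepRL
Reported by | Deep Reinforcement Learning Lab (深度强化学习实验室)
Editor | 九三山人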
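[Overview] ICLR moved online this year. A standout submission from DeepMind and Harvard researchers uses a neural network to control a virtual rodent model. Papers with ethnic Chinese authors made up nearly 60% of the total, Google remained prominent with more than 80 accepted papers, and research teams in China kept pace, with several papers receiving perfect review scores. ICLR 2020 received 2,594 submissions and accepted 687: 48 orals, 108 spotlights, and 531 posters. The acceptance rate was 26.5%, slightly lower than last year's 31.4%. Reinforcement learning has long been a popular topic for ICLR submissions. In recent years, reinforcement learning and deep reinforcement learning have repeatedly surpassed the best human results in video games, board games, and card games. The AI chip design that Google researchers completed in six hours also relied on deep reinforcement learning, so the power of the approach should not be underestimated. This article lists 106 papers on deep reinforcement learning.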
| Rank | Avg. score | Paper URL | Title | Scores | Variance | Decision |
|---|---|---|---|---|---|---|
| 1 | 8 | https://openreview.net/forum?id=HJgLZR4KvH | Dynamics-aware Unsupervised Skill Discovery | 8 8 8 | 0 | Accept (Talk) |
| 1 | 8 | https://openreview.net/forum?id=H1gax6VtDB | Contrastive Learning Of Structured World Models | 8 8 8 | 0 | Accept (Talk) |
| 1 | 8 | https://openreview.net/forum?id=r1etN1rtPB | Implementation Matters In Deep RL: A Case Study On PPO And TRPO | 8 8 8 | 0 | Accept (Talk) |
| 1 | 8 | https://openreview.net/forum?id=HkxlcnVFwB | GenDICE: Generalized Offline Estimation Of Stationary Values | 8 8 8 | 0 | Accept (Talk) |
| 1 | 8 | https://openreview.net/forum?id=S1g2skStPB | Causal Discovery With Reinforcement Learning | 8 8 8 | 0 | Accept (Talk) |
| 2 | 7.33 | https://openreview.net/forum?id=r1genAVKPB | Is A Good Representation Sufficient For Sample Efficient Reinforcement Learning? | 8 8 6 | 0.89 | Accept (Spotlight) |
| 2 | 7.33 | https://openreview.net/forum?id=rklHqRVKvH | Harnessing Structures For Value-based Planning And Reinforcement Learning | 6 8 8 | 0.89 | Accept (Talk) |
| 2 | 7.33 | https://openreview.net/forum?id=SJgzLkBKPB | Explain Your Move: Understanding Agent Actions Using Focused Feature Saliency | 6 8 8 | 0.89 | Accept (Poster) |
| 2 | 7.33 | https://openreview.net/forum?id=SJeD3CEFPH | Meta-Q-Learning | 8 8 6 | 0.89 | Accept (Talk) |
| 2 | 7.33 | https://openreview.net/forum?id=HJl8_eHYvS | Discriminative Particle Filter Reinforcement Learning For Complex Partial Observations | 8 6 8 | 0.89 | Accept (Poster) |
| 2 | 7.33 | https://openreview.net/forum?id=rkgbYyHtwB | Disagreement-regularized Imitation Learning | 6 8 8 | 0.89 | Accept (Spotlight) |
| 2 | 7.33 | https://openreview.net/forum?id=S1glGANtDr | Doubly Robust Bias Reduction In Infinite Horizon Off-policy Estimation | 6 8 8 | 0.89 | Accept (Spotlight) |
| 2 | 7.33 | https://openreview.net/forum?id=rkgvXlrKwH | SEED RL: Scalable And Efficient Deep-RL With Accelerated Central Inference | 8 6 8 | 0.89 | Accept (Talk) |
| 2 | 7.33 | https://openreview.net/forum?id=rJe2syrtvS | The Ingredients Of Real World Robotic Reinforcement Learning | 6 8 8 | 0.89 | Accept (Spotlight) |
| 2 | 7.33 | https://openreview.net/forum?id=BJlQtJSKDB | Watch The Unobserved: A Simple Approach To Parallelizing Monte Carlo Tree Search | 8 6 8 | 0.89 | Accept (Talk) |
| 2 | 7.33 | https://openreview.net/forum?id=ryeYpJSKwr | Meta-learning Acquisition Functions For Transfer Learning In Bayesian Optimization | 8 6 8 | 0.89 | Accept (Spotlight) |
| 2 | 7.33 | https://openreview.net/forum?id=ryxdEkHtPS | A Closer Look At Deep Policy Gradients | 8 6 8 | 0.89 | Accept (Talk) |
| 2 | 7.33 | https://openreview.net/forum?id=BJeAHkrYDS | Fast Task Inference With Variational Intrinsic Successor Features | 8 6 8 | 0.89 | Accept (Talk) |
| 2 | 7.33 | https://openreview.net/forum?id=rJgJDAVKvB | Learning To Plan In High Dimensions Via Neural Exploration-exploitation Trees | 8 8 6 | 0.89 | Accept (Spotlight) |
| 3 | 7 | https://openreview.net/forum?id=S1lOTC4tDS | Dream To Control: Learning Behaviors By Latent Imagination | 8 6 6 8 | 1 | Accept (Spotlight) |
| 4 | 6.67 | https://openreview.net/forum?id=SygKyeHKDH | Making Efficient Use Of Demonstrations To Solve Hard Exploration Problems | 6 8 6 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=SJleNCNtDH | Intrinsic Motivation For Encouraging Synergistic Behavior | 6 8 6 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=S1xKd24twB | SQIL: Imitation Learning Via Reinforcement Learning With Sparse Rewards | 8 6 6 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=ryxgJTEYDr | Reinforcement Learning With Competitive Ensembles Of Information-constrained Primitives | 8 6 6 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=B1gZV1HYvS | Multi-agent Interactions Modeling With Correlated Policies | 6 6 8 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=BJgy96EYvr | Influence-based Multi-agent Exploration | 6 6 8 | 0.89 | Accept (Spotlight) |
| 4 | 6.67 | https://openreview.net/forum?id=rylJkpEtwS | Learning The Arrow Of Time For Problems In Reinforcement Learning | 6 6 8 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=Bkl7bREtDr | AMRL: Aggregated Memory For Reinforcement Learning | 6 6 8 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=S1xCPJHtDB | Model Based Reinforcement Learning For Atari | 6 8 6 | 0.89 | Accept (Spotlight) |
| 4 | 6.67 | https://openreview.net/forum?id=r1lL4a4tDB | Variational Recurrent Models For Solving Partially Observable Control Tasks | 6 6 8 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=HJlxIJBFDr | Sample Efficient Policy Gradient Methods With Recursive Variance Reduction | 6 8 6 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=H1exf64KwH | Exploring Model-based Planning With Policy Networks | 6 8 6 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=HygnDhEtvr | Reinforcement Learning Based Graph-to-sequence Model For Natural Question Generation | 6 6 8 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=rkg-TJBFPB | RIDE: Rewarding Impact-driven Exploration For Procedurally-generated Environments | 6 6 8 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=ryeG924twB | Learning Expensive Coordination: An Event-based Deep RL Approach | 6 8 6 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=SJxbHkrKDH | Evolutionary Population Curriculum For Scaling Multi-agent Reinforcement Learning | 6 8 6 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=S1xitgHtvS | Making Sense Of Reinforcement Learning And Probabilistic Inference | 6 6 8 | 0.89 | Accept (Spotlight) |
| 4 | 6.67 | https://openreview.net/forum?id=rkxDoJBYPB | Reinforced Genetic Algorithm Learning For Optimizing Computation Graphs | 8 6 6 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=Sye57xStvB | Never Give Up: Learning Directed Exploration Strategies | 6 6 8 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=HJgC60EtwB | Robust Reinforcement Learning For Continuous Control With Model Misspecification | 6 6 8 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=S1l8oANFDH | Synthesizing Programmatic Policies That Inductively Generalize | 6 8 6 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=r1lOgyrKDS | Adaptive Correlated Monte Carlo For Contextual Categorical Sequence Generation | 6 6 8 | 0.89 | Accept (Poster) |
| 4 | 6.67 | https://openreview.net/forum?id=S1evHerYPr | Improving Generalization In Meta Reinforcement Learning Using Neural Objectives | 6 6 8 | 0.89 | Accept (Spotlight) |
| 5 | 6.33 | https://openreview.net/forum?id=rJeQoCNYDS | Single Episode Transfer For Differing Environmental Dynamics In Reinforcement Learning | 3 8 8 | 5.56 | Accept (Poster) |
| 5 | 6.33 | https://openreview.net/forum?id=H1gX8C4YPr | Decentralized Distributed PPO: Mastering Pointgoal Navigation | 3 8 8 | 5.56 | Accept (Poster) |
| 6 | 6.25 | https://openreview.net/forum?id=SJezGp4YPr | Geometric Insights Into The Convergence Of Nonlinear TD Learning | 8 3 6 8 | 4.19 | Accept (Poster) |
| 6 | 6.25 | https://openreview.net/forum?id=BJgZGeHFPH | Dynamics-aware Embeddings | 3 8 6 8 | 4.19 | Accept (Poster) |
| 7 | 6.2 | https://openreview.net/forum?id=S1ly10EKDS | Reanalysis Of Variance Reduced Temporal Difference Learning | 8 8 6 3 6 | 3.36 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=BkglSTNFDB | Q-learning With UCB Exploration Is Sample Efficient For Infinite-horizon MDP | 6 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=H1e0Wp4KvH | Automated Curriculum Generation Through Setter-solver Interactions | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=r1xGP6VYwH | Optimistic Exploration Even With A Pessimistic Initialisation | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=Syx7A3NFvH | Multi-agent Reinforcement Learning For Networked System Control | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=BJe1334YDH | A Learning-based Iterative Method For Solving Vehicle Routing Problems | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=rkgpv2VFvr | Sharing Knowledge In Multi-task Deep Reinforcement Learning | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=SJgob6NKvH | RTFM: Generalising To New Environment Dynamics Via Reading | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=HkgsWxrtPB | Meta Reinforcement Learning With Autonomous Inference Of Subtask Dependencies | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=rke3TJrtPS | Projection Based Constrained Policy Optimization | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=B1x6w0EtwH | Graph Constrained Reinforcement Learning For Natural Language Action Spaces | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=SylOlp4FvH | V-MPO: On-policy Maximum A Posteriori Policy Optimization For Discrete And Continuous Control | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=SJexHkSFPS | Thinking While Moving: Deep Reinforcement Learning With Concurrent Control | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=rke7geHtwH | Keep Doing What Worked: Behavior Modelling Priors For Offline Reinforcement Learning | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=Hyg-JC4FDr | Imitation Learning Via Off-policy Distribution Matching | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=ByxdUySKvS | Adversarial AutoAugment | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=B1gqipNYwH | Option Discovery Using Deep Skill Chaining | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=HJgLLyrYwB | State-only Imitation With Transition Dynamics Mismatch | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=HyxnMyBKwB | The Gambler’s Problem And Beyond | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=B1e-kxSKDH | Structured Object-aware Physics Prediction For Video Modeling And Planning | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=H1lmhaVtvr | Dynamical Distance Learning For Semi-supervised And Unsupervised Skill Discovery | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=SkeIyaVtwB | Exploration In Reinforcement Learning With Deep Covering Options | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=S1lEX04tPr | CM3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=ryxB2lBtvH | Learning To Coordinate Manipulation Skills Via Skill Behavior Diversification | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=H1ezFREtwH | Composing Task-agnostic Policies With Deep Reinforcement Learning | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=B1gskyStwr | Frequency-based Search-control In Dyna | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=S1ltg1rFDS | Black-box Off-policy Estimation For Infinite-horizon Reinforcement Learning | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=ryg48p4tPH | Action Semantics Network: Considering The Effects Of Actions In Multiagent Systems | 6 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=BkxXe0Etwr | CAQL: Continuous Action Q-learning | 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=SkgC6TNFvr | Reinforced Active Learning For Image Segmentation | 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=Hye1kTVFDS | The Variational Bandwidth Bottleneck: Stochastic Evaluation On An Information Budget | 6 6 | 0 | Accept (Poster) |
| 8 | 6 | https://openreview.net/forum?id=H1gzR2VKDH | Hierarchical Foresight: Self-supervised Learning Of Long-horizon Tasks Via Visual Subgoal Generation | 6 6 | 0 | Accept (Poster) |
| 9 | 5.75 | https://openreview.net/forum?id=BJliakStvH | Maximum Likelihood Constraint Inference For Inverse Reinforcement Learning | 8 6 3 6 | 3.19 | Accept (Spotlight) |
| 9 | 5.75 | https://openreview.net/forum?id=rygfnn4twS | AutoQ: Automated Kernel-wise Neural Network Quantization | 6 6 8 3 | 3.19 | Accept (Poster) |
| 9 | 5.75 | https://openreview.net/forum?id=Hkl9JlBYvr | VariBAD: A Very Good Method For Bayes-adaptive Deep RL Via Meta-learning | 8 6 8 1 | 8.19 | Accept (Poster) |
| 10 | 5.67 | https://openreview.net/forum?id=SJg5J6NtDr | Watch, Try, Learn: Meta-learning From Demonstrations And Rewards | 8 3 6 | 4.22 | Accept (Poster) |
| 10 | 5.67 | https://openreview.net/forum?id=rJeINp4KwH | Population-guided Parallel Policy Search For Reinforcement Learning | 6 8 3 | 4.22 | Accept (Poster) |
| 10 | 5.67 | https://openreview.net/forum?id=HJgcvJBFvB | A Simple Randomization Technique For Generalization In Deep Reinforcement Learning | 8 3 6 | 4.22 | Accept (Poster) |
| 10 | 5.67 | https://openreview.net/forum?id=H1eCw3EKvH | On The Weaknesses Of Reinforcement Learning For Neural Machine Translation | 8 6 3 | 4.22 | Accept (Poster) |
| 10 | 5.67 | https://openreview.net/forum?id=rylrdxHFDr | State Alignment-based Imitation Learning | 6 8 3 | 4.22 | Accept (Poster) |
| 10 | 5.67 | https://openreview.net/forum?id=rylvYaNYDH | Finding And Visualizing Weaknesses Of Deep Reinforcement Learning Agents | 8 6 3 | 4.22 | Accept (Poster) |
| 10 | 5.67 | https://openreview.net/forum?id=Skln2A4YDB | Model-augmented Actor-critic: Backpropagating Through Paths | 3 6 8 | 4.22 | Accept (Poster) |
| 10 | 5.67 | https://openreview.net/forum?id=rygf-kSYwH | Behaviour Suite For Reinforcement Learning | 8 3 6 | 4.22 | Accept (Spotlight) |
| 10 | 5.67 | https://openreview.net/forum?id=BJluxREKDB | Learning Heuristics For Quantified Boolean Formulas Through Reinforcement Learning | 6 8 3 | 4.22 | Accept (Poster) |
| 10 | 5.67 | https://openreview.net/forum?id=Bkg0u3Etwr | Maxmin Q-learning: Controlling The Estimation Bias Of Q-learning | 8 6 3 | 4.22 | Accept (Poster) |
| 10 | 5.67 | https://openreview.net/forum?id=ryx6WgStPB | Hypermodels For Exploration | 8 3 6 | 4.22 | Accept (Poster) |
| 11 | 5.5 | https://openreview.net/forum?id=ByeWogStDS | Sub-policy Adaptation For Hierarchical Reinforcement Learning | 3 8 | 6.25 | Accept (Poster) |
| 11 | 5.5 | https://openreview.net/forum?id=r1xPh2VtPB | SVQN: Sequential Variational Soft Q-learning Networks | 3 8 | 6.25 | Accept (Poster) |
| 12 | 5.25 | https://openreview.net/forum?id=BJeGlJStPr | IMPACT: Importance Weighted Asynchronous Architectures With Clipped Target Networks | 6 3 6 6 | 1.69 | Accept (Poster) |
| 13 | 5 | https://openreview.net/forum?id=rJld3hEYvS | Ranking Policy Gradient | 6 3 6 | 2 | Accept (Poster) |
| 13 | 5 | https://openreview.net/forum?id=HklxbgBKvr | Model-based Reinforcement Learning For Biological Sequence Design | 6 3 6 | 2 | Accept (Poster) |
| 13 | 5 | https://openreview.net/forum?id=HJx-3grYDB | Learning Nearly Decomposable Value Functions Via Communication Minimization | 6 6 3 | 2 | Accept (Poster) |
| 13 | 5 | https://openreview.net/forum?id=Byx4NkrtDS | Implementing Inductive Bias For Different Navigation Tasks Through Diverse RNN Attractors | 3 6 6 | 2 | Accept (Poster) |
| 13 | 5 | https://openreview.net/forum?id=SylL0krYPS | Toward Evaluating Robustness Of Deep Reinforcement Learning With Continuous Control | 6 3 6 | 2 | Accept (Poster) |
| 13 | 5 | https://openreview.net/forum?id=rJxX8T4Kvr | Learning Efficient Parameter Server Synchronization Policies For Distributed SGD | 6 3 6 | 2 | Accept (Poster) |
| 13 | 5 | https://openreview.net/forum?id=HkxjqxBYDB | Episodic Reinforcement Learning With Associative Memory | 6 3 6 | 2 | Accept (Poster) |
| 14 | 4.67 | https://openreview.net/forum?id=rkecJ6VFvr | Logic And The 2-simplicial Transformer | 8 3 3 | 5.56 | Accept (Poster) |
| 15 | 4 | https://openreview.net/forum?id=rkl3m1BFDB | Exploratory Not Explanatory: Counterfactual Analysis Of Saliency Maps For Deep RL | 1 3 8 | 8.67 | Accept (Poster) |
| 15 | 4 | https://openreview.net/forum?id=S1xnXRVFwH | Playing The Lottery With Rewards And Multiple Languages: Lottery Tickets In RL And NLP | 3 3 6 | 2 | Accept (Poster) |
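The average score and Variance values listed for each paper can be recomputed from the individual review scores. The sketch below is a minimal illustration, assuming the listed variance is the population variance (dividing by the number of reviews rather than by one less); that assumption is inferred from the listed numbers and is not stated by the source. The helper name `summarize` is hypothetical.

```python
from statistics import fmean, pvariance


def summarize(scores):
    """Return (mean, population variance) of review scores, rounded to 2 decimals."""
    values = [float(s) for s in scores]
    # pvariance divides by n (the number of reviews), not n - 1.
    return round(fmean(values), 2), round(pvariance(values), 2)


# Score lists copied from three entries in the list above.
examples = {
    "A Closer Look At Deep Policy Gradients": [8, 6, 8],
    "Dream To Control: Learning Behaviors By Latent Imagination": [8, 6, 6, 8],
    "Sub-policy Adaptation For Hierarchical Reinforcement Learning": [3, 8],
}

for title, scores in examples.items():
    mean, variance = summarize(scores)
    print(f"{title}: mean={mean}, variance={variance}")
```

Under this assumption the three examples reproduce the listed values: 7.33 and 0.89, 7 and 1, and 5.5 and 6.25.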
This article is also published at:
Zhihu: https://zhuanlan.zhihu.com/c_1196078521515343872
Github: https://github.com/NeuronDance/DeepRL/tree/master/DRL-ConferencePaper/ICLR/2020
Contact and collaboration
Add WeChat ID yan_kylin_phenix and include your name, affiliation, field of work, and location. Serious inquiries only.