Strategic Attentive Writer for Learning Macro-Actions


http://arxiv.org/pdf/1606.04695.pdf
Alexander (Sasha) Vezhnevets, Volodymyr Mnih, John Agapiou,
Simon Osindero, Alex Graves, Oriol Vinyals, Koray Kavukcuoglu
Google DeepMind

We present a novel deep recurrent neural network architecture that learns to build implicit plans in an end-to-end manner, purely by interacting with an environment in a reinforcement learning setting. The network builds an internal plan, which is continuously updated upon observation of the next input from the environment. It can also partition this internal representation into contiguous sub-sequences by learning for how long the plan can be committed to, i.e. followed without re-planning. Combining these properties, the proposed model, dubbed STRategic Attentive Writer (STRAW), can learn high-level, temporally abstracted macro-actions of varying lengths that are learnt solely from data, without any prior information. These macro-actions enable both structured exploration and economic computation. We experimentally demonstrate that STRAW delivers strong improvements on several ATARI games (e.g. Ms. Pacman and Frostbite) by employing temporally extended planning strategies. At the same time, it is a general algorithm that can be applied to any sequence data. To that end, we also show that when trained on a text prediction task, STRAW naturally predicts frequent n-grams (instead of macro-actions), demonstrating the generality of the approach.
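
To make the plan/commitment idea concrete, here is a minimal NumPy sketch of the rollout control flow, not the paper's trained model: the names `StrawSketch`, `NUM_ACTIONS`, `HORIZON`, and the linear write `W` are illustrative assumptions, and the paper's DRAW-style attentive writing and learned commitment network are replaced by simple stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_ACTIONS = 4    # |A|: size of the action set (toy value)
HORIZON = 10       # T: how many future steps the plan covers

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class StrawSketch:
    """Toy rollout of STRAW's two-plan mechanism: an action plan A
    (actions x time) and a commitment plan c that gates re-planning.
    The attentive write of the paper is replaced by a stand-in update,
    so this only illustrates the control flow."""

    def __init__(self, feature_dim):
        self.A = np.zeros((NUM_ACTIONS, HORIZON))  # action logits per future step
        self.c = np.zeros(HORIZON)                 # re-planning logits per future step
        # Hypothetical projection from observation features to action logits.
        self.W = rng.standard_normal((NUM_ACTIONS, feature_dim)) * 0.1

    def step(self, features):
        # g_t: stochastic re-plan gate read from the head of the commitment plan.
        replan = rng.random() < sigmoid(self.c[0])
        if replan:
            # Plan update: the paper writes into A with DRAW-like attention
            # conditioned on the RNN state; here, a simple broadcast write.
            self.A = self.A + (self.W @ features)[:, None]
            self.c = rng.standard_normal(HORIZON)  # stand-in commitment update
        else:
            # Commit: shift both plans one step left (time advances), pad the end.
            self.A = np.concatenate([self.A[:, 1:], np.zeros((NUM_ACTIONS, 1))], axis=1)
            self.c = np.concatenate([self.c[1:], [0.0]])
        # Act by sampling from the first column of the action plan.
        action = rng.choice(NUM_ACTIONS, p=softmax(self.A[:, 0]))
        return action, replan

# Usage: macro-actions show up as runs of "commit" steps between re-plans.
agent = StrawSketch(feature_dim=8)
for t in range(20):
    obs = rng.standard_normal(8)
    action, replanned = agent.step(obs)
    print(t, action, "replan" if replanned else "commit")
```

Under this simplification, each stretch of committed steps between two re-planning events plays the role of one variable-length macro-action: the agent follows the stored plan without recomputing it, which is where the structured exploration and computational savings come from.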
