Fast reinforcement learning through the composition of behaviours

Further reading  

GPE, successor features, and related approaches

Improving Generalisation for Temporal Difference Learning: The Successor Representation. Peter Dayan. Neural Computation, 1993.

Apprenticeship Learning via Inverse Reinforcement Learning. Pieter Abbeel and Andrew Y. Ng. Proceedings of the International Conference on Machine Learning (ICML), 2004.

Horde: A Scalable Real-time Architecture for Learning Knowledge from Unsupervised Sensorimotor Interaction. Richard S. Sutton, Joseph Modayil, Michael Delp, Thomas Degris, Patrick M. Pilarski, Adam White. Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2011.

Multi-timescale Nexting in a Reinforcement Learning Robot. Joseph Modayil, Adam White, Richard S. Sutton. From Animals to Animats, 2012.

Universal Value Function Approximators. Tom Schaul, Dan Horgan, Karol Gregor, David Silver. Proceedings of the International Conference on Machine Learning (ICML), 2015.

Deep Successor Reinforcement Learning. Tejas D. Kulkarni, Ardavan Saeedi, Simanta Gautam, Samuel J. Gershman. arXiv, 2016.

Visual Semantic Planning Using Deep Successor Representations. Yuke Zhu, Daniel Gordon, Eric Kolve, Dieter Fox, Li Fei-Fei, Abhinav Gupta, Roozbeh Mottaghi, Ali Farhadi. Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2017.

Deep Reinforcement Learning with Successor Features for Navigation Across Similar Environments. Jingwei Zhang, Jost Tobias Springenberg, Joschka Boedecker, Wolfram Burgard. IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017.

Universal Successor Representations for Transfer Reinforcement Learning. Chen Ma, Junfeng Wen, Yoshua Bengio. arXiv, 2018.

Eigenoption Discovery through the Deep Successor Representation. Marlos C. Machado, Clemens Rosenbaum, Xiaoxiao Guo, Miao Liu, Gerald Tesauro, Murray Campbell. International Conference on Learning Representations (ICLR), 2018.

Successor Options: An Option Discovery Framework for Reinforcement Learning. Rahul Ramesh, Manan Tomar, Balaraman Ravindran. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2019.

Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning. David Janz, Jiri Hron, Przemysław Mazur, Katja Hofmann, José Miguel Hernández-Lobato, Sebastian Tschiatschek. Advances in Neural Information Processing Systems (NeurIPS), 2019.

Successor Features Combine Elements of Model-Free and Model-based Reinforcement Learning. Lucas Lehnert, Michael L. Littman. arXiv, 2019.

Count-Based Exploration with the Successor Representation. Marlos C. Machado, Marc G. Bellemare, Michael Bowling. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), 2020.
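
The common thread in the papers above is the successor representation (SR) and its feature-based generalisation. As a companion to the list, here is a minimal tabular sketch in Python (an illustration, not code from any of the papers; the state count, learning rate, and function name are assumptions):

import numpy as np

n_states, gamma, alpha = 5, 0.95, 0.1
M = np.zeros((n_states, n_states))  # M[s, s'] ~ expected discounted future visits to s' from s

def sr_td_update(s, s_next):
    # Temporal-difference update of the SR after a transition s -> s_next (Dayan, 1993):
    # the target is the indicator vector of s plus the discounted SR of s_next.
    onehot = np.eye(n_states)[s]
    M[s] += alpha * (onehot + gamma * M[s_next] - M[s])

Given any reward vector r over states, values follow by a linear readout, V = M @ r; this is what makes the SR a natural substrate for generalised policy evaluation (GPE), since re-evaluating a policy under a new reward requires no further learning.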

GPI, hierarchical RL, and related approaches

A Robust Layered Control System for a Mobile Robot. R. Brooks. IEEE Journal on Robotics and Automation, 1986.

Feudal Reinforcement Learning. Peter Dayan and Geoffrey E. Hinton. Advances in Neural Information Processing Systems (NIPS), 1992.

Action Selection Methods Using Reinforcement Learning. Mark Humphrys. PhD thesis, University of Cambridge, Cambridge, UK, 1997.

Learning to Solve Multiple Goals. Jonas Karlsson. PhD thesis, University of Rochester, Rochester, New York, 1997.

Reinforcement Learning with Hierarchies of Machines. Ronald Parr and Stuart J. Russell. Advances in Neural Information Processing Systems (NIPS), 1997.

Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning. Richard S. Sutton, Doina Precup, Satinder Singh. Artificial Intelligence, 1999.

Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition. T. G. Dietterich. Journal of Artificial Intelligence Research, 2000.

Multiple-Goal Reinforcement Learning with Modular Sarsa(0). Nathan Sprague and Dana Ballard. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), 2003.

Q-Decomposition for Reinforcement Learning Agents. Stuart J. Russell and Andrew Zimdars. Proceedings of the International Conference on Machine Learning (ICML), 2003.

Compositionality of Optimal Control Laws. E. Todorov. Advances in Neural Information Processing Systems (NIPS), 2009.

Linear Bellman combination for control of character animation. M. da Silva, F. Durand, and J. Popovic. ACM Transactions on Graphics, 2009.

Hierarchy Through Composition with Multitask LMDPs. A. M. Saxe, A. C. Earle, and B. Rosman. Proceedings of the International Conference on Machine Learning (ICML), 2017.

Hybrid Reward Architecture for Reinforcement Learning. Harm van Seijen, Mehdi Fatemi, Joshua Romoff, Romain Laroche, Tavian Barnes, and Jeffrey Tsang. Advances in Neural Information Processing Systems (NIPS), 2017.

Feudal Networks for Hierarchical Reinforcement Learning. Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu. Proceedings of the International Conference on Machine Learning (ICML), 2017.

Composable Deep Reinforcement Learning for Robotic Manipulation. T. Haarnoja, V. Pong, A. Zhou, M. Dalal, P. Abbeel, and S. Levine. IEEE International Conference on Robotics and Automation (ICRA), 2018.

Composing Value Functions in Reinforcement Learning. Benjamin Van Niekerk, Steven James, Adam Earle, Benjamin Rosman. Proceedings of the International Conference on Machine Learning (ICML), 2019.

Planning in Hierarchical Reinforcement Learning: Guarantees for Using Local Policies. Tom Zahavy, Avinatan Hassidim, Haim Kaplan, Yishay Mansour. International Conference on Algorithmic Learning Theory (ALT), 2020.
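
The unifying idea in several of these papers is generalised policy improvement (GPI): given the action-value functions of several known policies, act greedily with respect to their pointwise maximum; the resulting policy performs at least as well as every policy in the set. A minimal sketch (an illustration with assumed array shapes, not code from the papers):

import numpy as np

def gpi_action(q_functions, state):
    # q_functions: list of arrays of shape [n_states, n_actions], one per known policy.
    q_stack = np.stack([q[state] for q in q_functions])  # [n_policies, n_actions]
    return int(np.argmax(q_stack.max(axis=0)))           # argmax_a max_i Q_i(state, a)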

GPE + GPI, transfer learning, and related approaches

Transfer of Learning by Composing Solutions of Elemental Sequential Tasks. Satinder Singh. Machine Learning, 1992.

Transfer Learning for Reinforcement Learning Domains: A Survey. Matthew E. Taylor and Peter Stone. Journal of Machine Learning Research, 2009.

Transfer in Variable-Reward Hierarchical Reinforcement Learning. Neville Mehta, Sriraam Natarajan, Prasad Tadepalli, Alan Fern. Machine Learning, 2008.

Learning and Transfer of Modulated Locomotor Controllers. Nicolas Heess, Greg Wayne, Yuval Tassa, Timothy Lillicrap, Martin Riedmiller, David Silver. arXiv, 2016.

Learning to Reinforcement Learn. Jane X. Wang, Zeb Kurth-Nelson, Dhruva Tirumala, Hubert Soyer, Joel Z. Leibo, Remi Munos, Charles Blundell, Dharshan Kumaran, Matt Botvinick. arXiv, 2016.

RL²: Fast Reinforcement Learning via Slow Reinforcement Learning. Yan Duan, John Schulman, Xi Chen, Peter L. Bartlett, Ilya Sutskever, Pieter Abbeel. arXiv, 2016.

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. Chelsea Finn, Pieter Abbeel, Sergey Levine. Proceedings of the International Conference on Machine Learning (ICML), 2017.

Successor Features for Transfer in Reinforcement Learning. André Barreto, Will Dabney, Rémi Munos, Jonathan J. Hunt, Tom Schaul, Hado van Hasselt, David Silver. Advances in Neural Information Processing Systems (NIPS), 2017.

Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement. André Barreto, Diana Borsa, John Quan, Tom Schaul, David Silver, Matteo Hessel, Daniel Mankowitz, Augustin Žídek, Rémi Munos. Proceedings of the International Conference on Machine Learning (ICML), 2018.

Composing Entropic Policies Using Divergence Correction. Jonathan Hunt, André Barreto, Timothy Lillicrap, Nicolas Heess. Proceedings of the International Conference on Machine Learning (ICML), 2019.

Universal Successor Features Approximators. Diana Borsa, André Barreto, John Quan, Daniel Mankowitz, Rémi Munos, Hado van Hasselt, David Silver, Tom Schaul. International Conference on Learning Representations (ICLR), 2019.

The Option Keyboard: Combining Skills in Reinforcement Learning. André Barreto, Diana Borsa, Shaobo Hou, Gheorghe Comanici, Eser Aygün, Philippe Hamel, Daniel Toyama, Jonathan J. Hunt, Shibl Mourad, David Silver, Doina Precup. Advances in Neural Information Processing Systems (NeurIPS), 2019.

Transfer Learning in Deep Reinforcement Learning: A Survey. Zhuangdi Zhu, Kaixiang Lin, Jiayu Zhou. arXiv, 2020.

Fast Task Inference with Variational Intrinsic Successor Features. Steven Hansen, Will Dabney, André Barreto, Tom Van de Wiele, David Warde-Farley, Volodymyr Mnih. International Conference on Learning Representations (ICLR), 2020.

Fast Reinforcement Learning with Generalized Policy Updates. André Barreto, Shaobo Hou, Diana Borsa, David Silver, Doina Precup. Proceedings of the National Academy of Sciences, 2020.
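
The combination that gives the article its title can be sketched in a few lines (a tabular caricature under assumed shapes, not the authors' code). Each known policy pi_i carries successor features psi_i(s, a) = E[sum_t gamma^t phi(s_t, a_t)]; a new task is a weight vector w with reward r(s, a) = phi(s, a) . w, so GPE reduces to a dot product and GPI to a max:

import numpy as np

def gpe_gpi_action(psis, w, state):
    # psis: list of [n_states, n_actions, d] successor-feature arrays, one per policy.
    # w: [d] reward-weight vector defining the new task.
    qs = np.stack([psi[state] @ w for psi in psis])  # GPE: [n_policies, n_actions]
    return int(np.argmax(qs.max(axis=0)))            # GPI: best action under the best policy

In the scheme of Barreto et al. (2017, 2018, 2020), no new learning is needed to act reasonably on any task whose reward lies in the span of the features phi.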

The successor representation in neuroscience

The Hippocampus as a Predictive Map. Kimberly Stachenfeld, Matthew Botvinick, Samuel Gershman. Nature Neuroscience, 2017.

The Successor Representation in Human Reinforcement Learning. I. Momennejad, E. M. Russek, J. H. Cheong, M. M. Botvinick, N. D. Daw, S. J. Gershman. Nature Human Behaviour, 2017.

Predictive Representations Can Link Model-Based Reinforcement Learning to Model-Free Mechanisms. E. Russek, I. Momennejad, M. M. Botvinick, S. J. Gershman, N. D. Daw. PLOS Computational Biology, 2017.

The Successor Representation: Its Computational Logic and Neural Substrates. Samuel J. Gershman. Journal of Neuroscience, 2018.

Better Transfer Learning with Inferred Successor Maps. Tamas J. Madarasz, Timothy E. Behrens. Advances in Neural Information Processing Systems (NeurIPS), 2019.

Multi-Task Reinforcement Learning in Humans. Momchil S. Tomov, Eric Schulz, and Samuel J. Gershman. bioRxiv, 2019.

A Neurally Plausible Model Learns Successor Representations in Partially Observable Environments. Eszter Vertes, Maneesh Sahani. Advances in Neural Information Processing Systems (NeurIPS), 2019.

Neurobiological Successor Features for Spatial Navigation. William de Cothi, Caswell Barry. Hippocampus, 2020.

Linear Reinforcement Learning: Flexible Reuse of Computation in Planning, Grid Fields, and Cognitive Control. Payam Piray, Nathaniel D. Daw. bioRxiv, 2020.
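
For readers coming to these papers from the RL side, the object the neuroscience work builds on has a clean closed form. Under a fixed policy with state-transition matrix T and discount factor gamma, the SR is (in LaTeX):

M = \sum_{t=0}^{\infty} \gamma^{t} T^{t} = (I - \gamma T)^{-1}, \qquad V = M r.

Stachenfeld et al. (2017) propose that hippocampal place fields encode the SR and that grid cells reflect its low-dimensional eigenvectors, which is the sense in which the hippocampus acts as a "predictive map".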

Source: https://deepmind.com/blog/article/fast-reinforcement-learning-through-the-composition-of-behaviours
