DOI: 10.26481/dis.20130613hb

Automated transfer in reinforcement learning

Abstract: People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website.
• The final author version and the galley proof are versions of the publication after peer review.
• The final published version features the final layout of the paper, including the volume, issue, and page numbers.
General rights: Copyright and moral rights for the publications made accessible in the public portal are retained by the authors a…

Cited by 18 publications (27 citation statements)
References 26 publications
“…They use their approach for policy transfer, which differs from the value-transfer method proposed in this paper. Ammar et al. (2014) learn the model of a source MDP and view the prediction error on a target MDP as a dissimilarity measure in the task space.…”
Section: Background and Related Work
Confidence: 99%
“…Task Selection has been studied for general transfer learning, and presents common aspects with the task selection that is part of sequencing in curriculum learning. Several approaches consider learning a mapping from source tasks to target tasks, and estimating the benefit of transferring between the tasks [22]-[24]. Nonetheless, transfer learning is usually performed between two tasks, a source and a target, and task selection methods have never been leveraged to achieve longer sequences.…”
Section: Related Work
Confidence: 99%
“…Consequently, numerous techniques have been proposed [17,30,35] to efficiently reuse the knowledge of learned tasks. A number of these [6,3,26] rely on a measure of similarity between MDPs in order to choose an appropriate source task to transfer from. However, this can be problematic, as no such universal metric exists [6], and some of the useful ones may be computationally expensive [3].…”
Section: Related Work
Confidence: 99%
“…A number of these [6,3,26] rely on a measure of similarity between MDPs in order to choose an appropriate source task to transfer from. However, this can be problematic, as no such universal metric exists [6], and some of the useful ones may be computationally expensive [3]. In the present work, the similarity metric used is computationally inexpensive, and the degree of similarity between two tasks is based solely on the value function weights associated with them.…”
Section: Related Work
Confidence: 99%
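The weight-based similarity described in that last statement admits a very small sketch. Assuming linear value functions V(s) = wᵀφ(s) over a shared feature map, cosine similarity between weight vectors is one computationally inexpensive choice; the metric and all names below are assumptions for illustration, not necessarily those used in the cited work.

```python
# Minimal sketch: task similarity from value-function weight vectors.
# Cosine similarity is an assumed stand-in for the inexpensive metric.
import numpy as np

def task_similarity(w_source, w_target):
    """Cosine similarity between two value-function weight vectors."""
    denom = float(np.linalg.norm(w_source) * np.linalg.norm(w_target))
    return float(np.dot(w_source, w_target)) / denom if denom > 0 else 0.0

# Usage: pick the most similar previously solved task as transfer source.
library = {"task_a": np.array([0.9, 0.1, -0.3]),
           "task_b": np.array([0.2, -0.8, 0.5])}
w_new = np.array([0.8, 0.2, -0.2])
best = max(library, key=lambda k: task_similarity(library[k], w_new))
print("transfer from:", best)
```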