2010
DOI: 10.1016/j.robot.2010.03.007

Probabilistic Policy Reuse for inter-task transfer learning

Abstract: Policy Reuse is a reinforcement learning technique that efficiently learns a new policy by reusing similar policies learned in the past. The Policy Reuse learner improves its exploration by probabilistically interleaving the exploitation of those past policies. Policy Reuse was previously introduced and shown to be effective in problems with different reward functions defined over the same state and action spaces. In this article, we contribute Policy Reuse as transfer learning among different domains. We introduce extende…
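The abstract's central mechanism, probabilistically mixing exploitation of a past policy into exploration on a new task, can be sketched in a few lines. The following is a minimal illustration assuming a tabular task and a reuse probability psi that decays within the episode; the environment interface (env.reset, env.step), the past policy pi_past, and all hyperparameter values are assumptions for the sake of the example, not the paper's exact formulation.

```python
import numpy as np

def pi_reuse_episode(env, Q, pi_past, psi=1.0, upsilon=0.95,
                     epsilon=0.1, alpha=0.1, gamma=0.95, max_steps=100):
    """Run one learning episode that probabilistically reuses a past
    policy while learning the new task's Q table (illustrative sketch)."""
    s = env.reset()
    for _ in range(max_steps):
        if np.random.rand() < psi:
            a = pi_past(s)                      # exploit the past policy
        elif np.random.rand() < epsilon:
            a = np.random.randint(Q.shape[1])   # random exploration
        else:
            a = int(np.argmax(Q[s]))            # exploit the new estimate
        s2, r, done = env.step(a)
        # Ordinary Q-learning update of the new task's value function.
        Q[s, a] += alpha * (r + gamma * np.max(Q[s2]) - Q[s, a])
        psi *= upsilon                          # decay reuse within the episode
        s = s2
        if done:
            break
    return Q
```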

Cited by 61 publications (41 citation statements). References 10 publications.
“…Policy Reuse has also been successfully applied in more complex domains, such as the Keepaway task in robot soccer, which requires: i) a mapping between tasks that use different state and action spaces; and ii) function approximation methods, since the state space is continuous [44,14].…”
Section: Results
confidence: 99%
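The two requirements listed in the quote above can be made concrete with a toy inter-task mapping: project the target task's state onto the features the source task knows about, then translate the source policy's action back into the target action set. The feature indices and action table below are hypothetical and only show the shape of such a mapping, not the one used for Keepaway.

```python
# Hypothetical mapping between a small source task and a larger target task.
SOURCE_FEATURES = [0, 1, 2, 3]      # target-state indices visible to the source task
ACTION_MAP = {0: 0, 1: 1, 2: 2}     # source action -> target action

def reuse_source_policy(pi_source, target_state):
    """Evaluate a source-task policy on a target-task state by projecting
    the state down and translating the chosen action back up."""
    source_state = [target_state[i] for i in SOURCE_FEATURES]
    return ACTION_MAP[pi_source(source_state)]
```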
“…We assume that we are using a direct RL method to learn the action policy, so we are learning the related Q function. Any RL algorithm can be used to learn the Q function, and Sarsa(λ) and Q(λ) have been applied [13,14].…”
Section: The π-Reuse Exploration Strategy
confidence: 99%
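The statement above notes that any direct RL algorithm can learn the Q function and that Sarsa(λ) and Q(λ) have been used. Below is a minimal tabular Sarsa(λ) sketch; the select_action callback stands in for whichever exploration strategy supplies actions (π-reuse included), and every name and hyperparameter value here is illustrative rather than taken from the paper.

```python
import numpy as np

def sarsa_lambda_episode(env, Q, select_action, alpha=0.1, gamma=0.95,
                         lam=0.9, max_steps=100):
    """One episode of tabular Sarsa(lambda) with accumulating eligibility
    traces (illustrative sketch)."""
    E = np.zeros_like(Q)              # eligibility traces
    s = env.reset()
    a = select_action(Q, s)
    for _ in range(max_steps):
        s2, r, done = env.step(a)
        a2 = select_action(Q, s2)
        delta = r + gamma * Q[s2, a2] * (not done) - Q[s, a]
        E[s, a] += 1.0                # accumulate the visited pair's trace
        Q += alpha * delta * E        # update all traced state-action pairs
        E *= gamma * lam              # decay all traces
        s, a = s2, a2
        if done:
            break
    return Q
```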
“…When the problem has been learned with m agents, and the next incremental problem with m+1 agents uses a state representation that sensorizes more neighbor agents, we need to use transfer learning techniques, as has been done for domains like Keepaway [5]. Specifically, a projection is used in order to obtain a new dataset in the state space of the new (m+1)-agent problem, included in ℝ^r.…”
Section: Multi-Agent IT-VQQL
confidence: 99%
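The projection mentioned in the quote maps experience gathered in the m-agent state space into the larger (m+1)-agent space. One simple possibility is sketched below, padding each state vector with placeholder features for the newly sensed neighbor; the cited work defines its own projection, so the fill strategy here is purely an assumption for illustration.

```python
import numpy as np

def project_dataset(states_m, features_per_agent, fill_value=0.0):
    """Pad each m-agent state vector with placeholder features for the
    newly sensed (m+1)-th neighbor agent (illustrative sketch)."""
    n = states_m.shape[0]
    padding = np.full((n, features_per_agent), fill_value)
    return np.hstack([states_m, padding])
```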
“…This approach is similar in spirit to that of case-based reasoning, where the similarity or reuse function is defined by the parameterized similarity metric. In later work, Fernández, García, and Veloso (2010) extend the method so that it can be used in Keepaway as well.…”
Section: Maze Navigation
confidence: 99%