Probabilistic policy reuse in a reinforcement learning agent

Fernández, Fernando; Veloso, Manuela

doi:10.1145/1160633.1160762

Cited by 189 publications (201 citation statements)

References 9 publications (8 reference statements)

Citation types: Supporting, Mentioning (197), Contrasting, Unclassified


“…The level of knowledge that can be transferred across tasks can be low, such as tuples of the form ⟨s, a, r, s′⟩ [6,10], value-functions [12] or policies [2]. Higher level knowledge may include rules [7,13], action subsets or shaping rewards [5].…”

Section: Transfer Learning in RL (mentioning)

confidence: 99%

Transfer Learning in Multi-Agent Reinforcement Learning Domains

Boutsioukis, Partalas, Vlahavas. 2012. Lecture Notes in Computer Science.

Abstract. Transfer learning refers to the process of reusing knowledge from past tasks in order to speed up the learning procedure in new tasks. In reinforcement learning, where agents often require a considerable amount of training, transfer learning is a suitable way to speed up learning. Transfer learning methods have primarily been applied in single-agent reinforcement learning algorithms, while no prior work has addressed this issue in the case of multi-agent learning. This work proposes a novel method for transfer learning in multi-agent reinforcement learning domains. We test the proposed approach in a multi-agent domain under various setups. The results demonstrate that the method helps to reduce the learning time and increase the asymptotic performance.


“…We also briefly introduce a similarity concept between policies. Lastly, we review the PRQ-Learning algorithm [8].…”

Section: Policy Reuse (mentioning)

confidence: 99%

“…An efficient solution to Policy Reuse is the PRQ-Learning algorithm [8], which automatically answers two questions: (i) which policy, from the set {Π*_1, . .…”

Section: Domains, Tasks and MDPs (mentioning)

confidence: 99%

“…Policy Reuse is a technique where the learner is guided by past policies balancing among its search for the optimal policy among three choices: the exploitation of the ongoing learned policy, the exploration of new random actions, and the exploitation of past policies [8]. Policy Reuse builds upon two main contributions: (i) an exploration strategy able to probabilistically bias the exploration of the domain with a predefined past policy; (ii) a similarity metric that allows the estimation of the similarity of past policies with respect to a new one.…”

Section: Introduction (mentioning)

confidence: 99%

“…Policy Reuse builds upon two main contributions: (i) an exploration strategy able to probabilistically bias the exploration of the domain with a predefined past policy; (ii) a similarity metric that allows the estimation of the similarity of past policies with respect to a new one. Policy Reuse has been demonstrated in a series of complex grid-based learning tasks where the efficiency of the learner significantly improves when reusing past policies [8]. Although the grid-based tasks have been extensively used in the evaluation of reinforcement learning for robot control, there remained the question of how Policy Reuse could be applied to domains potentially more complex, such as the Keepaway domain [4].…”

Section: Introduction (mentioning)

confidence: 99%
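The three-way choice described in the statements above (exploit a past policy, exploit the policy being learned, or take a random action) can be illustrated with a minimal sketch. It is not taken from the cited papers: the function name `pi_reuse_action`, the parameters `psi` (probability of following the past policy) and `epsilon` (probability of a random action), and the dictionary-based Q-table are all assumptions made for illustration.

```python
import random


def pi_reuse_action(state, past_policy, q_values, actions, psi, epsilon, rng=random):
    """Choose one action by mixing a past policy with the policy being learned.

    Hypothetical sketch, not the authors' implementation:
      - with probability psi, exploit the past policy;
      - otherwise, with probability epsilon, explore with a random action;
      - otherwise, exploit the current Q-value estimates greedily.
    """
    if rng.random() < psi:
        # Exploitation of the past policy.
        return past_policy(state)
    if rng.random() < epsilon:
        # Exploration of a new random action.
        return rng.choice(actions)
    # Exploitation of the ongoing learned policy (unseen pairs default to 0.0).
    return max(actions, key=lambda a: q_values.get((state, a), 0.0))
```

Setting `psi` to 0 reduces this sketch to ordinary epsilon-greedy selection, and setting it to 1 makes the agent replay the past policy exactly. How `psi` should change over an episode is left open here.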


Probabilistic Policy Reuse for inter-task transfer learning

Fernández, García, Veloso. 2010. Robotics and Autonomous Systems. Self-citation.

Policy Reuse is a reinforcement learning technique that efficiently learns a new policy by using past similar learned policies. The Policy Reuse learner improves its exploration by probabilistically including the exploitation of those past policies. Policy Reuse was previously introduced and shown to be effective in problems with different reward functions in the same state and action spaces. In this article, we contribute Policy Reuse as transfer learning among different domains. We introduce extended MDPs to include domains and tasks, where domains have different state and action spaces, and tasks are problems with different rewards within a domain. We show how Policy Reuse can be applied among domains by defining and using a mapping between their state and action spaces. We use several domains, as versions of a simulated RoboCup Keepaway problem, where we show that Policy Reuse can be used as a mechanism of transfer learning significantly outperforming a basic policy learner.
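One way to picture the inter-domain mapping mentioned in this abstract is as a wrapper that translates states into the source domain and actions back into the target domain. The sketch below is only an illustration under that assumption: `transfer_policy`, `map_state`, and `map_action` are hypothetical names, and the toy states and actions in the usage note are not the Keepaway features used in the article.

```python
from typing import Callable, Hashable

State = Hashable
Action = Hashable
Policy = Callable[[State], Action]


def transfer_policy(
    source_policy: Policy,
    map_state: Callable[[State], State],
    map_action: Callable[[Action], Action],
) -> Policy:
    """Wrap a source-domain policy so it can be queried in a target domain.

    Hypothetical sketch: map_state projects a target state onto the source
    state space, and map_action translates the source action into an action
    that exists in the target domain.
    """

    def target_policy(target_state: State) -> Action:
        source_state = map_state(target_state)       # target -> source state
        source_action = source_policy(source_state)  # query the past policy
        return map_action(source_action)             # source -> target action

    return target_policy
```

For example, a source policy defined over two-feature states can be reused in a three-feature domain with `transfer_policy(src, lambda s: s[:2], {"hold": "HOLD", "pass": "PASS_NEAREST"}.get)`, where the extra feature is simply dropped. The wrapped policy can then serve as the past policy in a policy-reuse exploration strategy.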


DRL and Emerging Topics in Wireless Networks

2023. Deep Reinforcement Learning for Wireless Communications and Networking.
