Learning domain structure through probabilistic policy reuse in reinforcement learning

Fernández, Fernando; Veloso, Manuela

doi:10.1007/s13748-012-0026-6

Cited by 28 publications

(18 citation statements)

References 26 publications

“…Fernández et al proposed a policy selection method using probabilities in [15], [16]. In this method called PRQlearning, the reusing policy is decided based on Boltzmann distribution selection method (Eqn.…”

Section: A Probabilistic Policy Reuse (mentioning)

confidence: 99%

Activation and Spreading Sequence for Spreading Activation Policy Selection Method in Transfer Reinforcement Learning

Kono, Katayama, Takakuwa et al. 2019

IJACSA

This paper proposes an automatic policy selection method using spreading activation theory based on psychological theory for transfer learning in reinforcement learning. Intelligent robot systems have recently been studied for practical applications such as home robots, communication robots, and warehouse robots. Learning algorithms are key to building useful robot systems. For example, a robot can explore for an optimal policy with trial and error using reinforcement learning. Moreover, transfer learning enables reuse of prior policies and is effective for environment adaptability. However, humans determine applicable methods in transfer learning. A policy selection method has been proposed for transfer learning in reinforcement learning using the spreading activation model proposed in cognitive psychology. In this paper, a novel activation function and spreading sequence are discussed for the spreading policy selection method. Further, computer simulations are used to examine the effectiveness of the proposed method for automatic policy selection in a simplified shortest-path problem.
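
The citation statement for this entry says that PRQ-learning chooses which policy to reuse with a Boltzmann distribution. A minimal sketch of that kind of selection follows. It assumes each candidate policy has an estimated gain (its average return when reused) and a sharpness parameter `tau`; the function names, the parameter names, and the sample values are illustrative and are not taken from the cited papers.

```python
import math
import random


def boltzmann_probabilities(gains, tau):
    """Return Boltzmann (softmax) probabilities over candidate policies.

    gains: estimated gain of reusing each policy (hypothetical input).
    tau:   sharpness parameter (assumed); tau = 0 gives a uniform choice,
           larger values concentrate probability on the highest gains.
    """
    # Subtract the maximum before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    top = max(gains)
    weights = [math.exp(tau * (g - top)) for g in gains]
    total = sum(weights)
    return [w / total for w in weights]


def select_policy(gains, tau, rng=random):
    """Sample the index of the policy to reuse in the next episode."""
    probs = boltzmann_probabilities(gains, tau)
    return rng.choices(range(len(gains)), weights=probs, k=1)[0]


if __name__ == "__main__":
    example_gains = [0.1, 0.5, 0.3]  # made-up values for illustration
    print(boltzmann_probabilities(example_gains, tau=2.0))
    print(select_policy(example_gains, tau=2.0, rng=random.Random(0)))
```

With a small `tau` the choice stays exploratory across policies; raising `tau` over time would make the selection increasingly greedy toward the policy with the highest estimated gain.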

“…The type of knowledge that can be transferred between tasks varies among different TL methods, including value functions [8], entire policies [9], actions (policy advice) [10], or a set of samples from a source task that can be used by a model-based RL algorithm in a target task [11].…”

Section: Transfer Learning and Advising Under A Budget (mentioning)

confidence: 99%

Learning to Teach Reinforcement Learning Agents

Fachantidis, Taylor, Vlahavas 2017

MAKE

Abstract: In this article, we study the transfer learning model of action advice under a budget. We focus on reinforcement learning teachers providing action advice to heterogeneous students playing the game of Pac-Man under a limited advice budget. First, we examine several critical factors affecting advice quality in this setting, such as the average performance of the teacher, its variance and the importance of reward discounting in advising. The experiments show that the best performers are not always the best teachers and reveal the non-trivial importance of the coefficient of variation (CV) as a statistic for choosing policies that generate advice. The CV statistic relates variance to the corresponding mean. Second, the article studies policy learning for distributing advice under a budget. Whereas most methods in the relevant literature rely on heuristics for advice distribution, we formulate the problem as a learning one and propose a novel reinforcement learning algorithm capable of learning when to advise or not. The proposed algorithm is able to advise even when it does not have knowledge of the student's intended action and needs significantly less training time compared to previous learning approaches. Finally, in this article, we argue that learning to advise under a budget is an instance of a more generic learning problem: Constrained Exploitation Reinforcement Learning.
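
The coefficient of variation mentioned in this abstract is conventionally the standard deviation divided by the mean. The sketch below computes it for a list of episode returns. The function name and the sample data are hypothetical, and the use of the sample standard deviation is an assumption; the abstract does not say how the authors computed the statistic or how they used it to rank teachers.

```python
import statistics


def coefficient_of_variation(returns):
    """Return the sample standard deviation divided by the absolute mean.

    returns: per-episode returns of one policy (hypothetical input).
    """
    mean = statistics.mean(returns)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    # statistics.stdev uses the sample (n - 1) estimator.
    return statistics.stdev(returns) / abs(mean)


if __name__ == "__main__":
    teacher_returns = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values
    print(coefficient_of_variation(teacher_returns))
```

Because the statistic is scale-free, it allows policies with very different average scores to be compared by how variable their performance is relative to that average.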

“…where An ∈ [0,1] is the transfer rate [11] and y~ni is a randomly generated policy. Now we present the detailed description of the proposed learning algorithm.…”

Section: A a Policy Transfer Based Hierarchical Multi-agent Qlearnin (mentioning)

confidence: 99%

Combined learning for resource allocation in autonomous heterogeneous cellular networks

Chen, Zhang, Chen et al. 2013

2013 IEEE 24th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC)

The cross- and co-tier interference creates the challenges to facilitate the concept of heterogeneous cellular networks (HCNs) in practice. In this paper, we establish a combined learning framework to autonomously mitigate the destructive interference. The macrocell is modeled as the leader and protects itself through pricing the interference from small-cells, which are the followers in the stochastic learning process. During each epoch (an epoch consists of T time slots), the leader commits to a pricing policy by knowing the resource allocation policies of all followers, while the followers compete against each other in each time slot only with the leader's price information. In general, for any two consecutive epochs, the HCN states are highly correlated. The previous policy information can thus be leveraged to improve the learning performance. Numerical results support that the proposed study substantially protects the macrocell and at the same time, optimizes the energy efficiency in small-cells.
