Collaborative Deep Reinforcement Learning

Lin, Kaixiang; Wang, Shu; Zhou, Jiayu

doi:10.48550/arxiv.1702.05796

Cited by 4 publications

(7 citation statements)

References 19 publications


“…Hence, policy training for the CDRL agents must be optimized jointly with the source agent selection and wireless resource allocation. Given these challenges, the problem in (4a)-(4d) cannot be solved via existing CDRL methods such as knowledge distillation [11]-[14] to transfer the source agents' knowledge to the target agent. Moreover, existing CDRL methods cannot be directly applied here as they do not account for the wireless and real-time constraints of knowledge sharing among agents.…”

Section: Problem Formulation (mentioning)

confidence: 99%

“…where $f_n(\theta_n)$ represents the loss function of the local model for agent $n \in N$. Considering policy gradient (PG) DRL algorithms [11], the loss function is defined as…”

Section: A Heterogeneous Federated DRL (mentioning)

confidence: 99%

“…Moreover, enabling an effective CDRL among heterogeneous agents (as expected in IoT applications) is very challenging, due to dissimilarities of agents (e.g., different action spaces), environments, and diversity of DRL tasks. In the CDRL context, the heterogeneity of environments, modeled as Markov decision processes (MDPs), as well as agents and their tasks can be expressed in two main forms: 1) distinct DRL tasks that are conceptually similar (i.e., semantically related tasks) or completely dissimilar [8], [9] and 2) distinct environments represented by different MDPs [10], [11]. Most existing works, such as in [5]-[7], study CDRL among homogeneous agents, i.e., agents with the same action space.…”

Section: Introduction (mentioning)

confidence: 99%

“…Most existing works, such as in [5]-[7], study CDRL among homogeneous agents, i.e., agents with the same action space. Meanwhile, the works in [8]-[11] consider more realistic scenarios by adopting knowledge transfer for CDRL among heterogeneous agents. While interesting, most of the prior art in [8]-[11] assumes that an expert agent is always available and DRL tasks are semantically similar, strong assumptions that are often inapplicable to practical IoT scenarios.…”

Section: Introduction (mentioning)

confidence: 99%


Semantic-Aware Collaborative Deep Reinforcement Learning Over Wireless Cellular Networks

Lotfi, Semiari, Saad

2021

Preprint


Collaborative deep reinforcement learning (CDRL) algorithms, in which multiple agents can coordinate over a wireless network, are a promising approach to enable future intelligent and autonomous systems that rely on real-time decision making in complex dynamic environments. Nonetheless, in practical scenarios, CDRL faces many challenges due to the heterogeneity of agents and their learning tasks, different environments, time constraints of the learning, and resource limitations of wireless networks. To address these challenges, in this paper, a novel semantic-aware CDRL method is proposed to enable a group of heterogeneous untrained agents with semantically-linked DRL tasks to collaborate efficiently across a resource-constrained wireless cellular network. To this end, a new heterogeneous federated DRL (HFDRL) algorithm is proposed to select the best subset of semantically relevant DRL agents for collaboration. The proposed approach then jointly optimizes the training loss and wireless bandwidth allocation for the cooperating selected agents in order to train each agent within the time limitation of its real-time task. Simulation results show the superior performance of the proposed algorithm compared to state-of-the-art baselines.


“…Our framework also learns from imperfect demonstrations, but treats actions performed by a peer policy as demonstrations. Collaborative learning is also studied in [Lin et al., 2017], however, with expensive pre-trained teachers. Meta-learning methods also make use of a teacher model to improve the sample efficiency [Xu et al., 2018b; Xu et al., 2018a; Zha et al., 2019b].…”

Section: Related Work (mentioning)

confidence: 99%

Dual Policy Distillation

Lai, Zha, et al.

2020

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence


Policy distillation, which transfers a teacher policy to a student policy, has achieved great success in challenging tasks of deep reinforcement learning. This teacher-student framework requires a well-trained teacher model, which is computationally expensive. Moreover, the performance of the student model could be limited by the teacher model if the teacher model is not optimal. In light of collaborative learning, we study the feasibility of involving joint intellectual efforts from diverse perspectives of student models. In this work, we introduce dual policy distillation (DPD), a student-student framework in which two learners operate on the same environment to explore different perspectives of the environment and extract knowledge from each other to enhance their learning. The key challenge in developing this dual learning framework is to identify the beneficial knowledge from the peer learner for contemporary learning-based reinforcement learning algorithms, since it is unclear whether the knowledge distilled from an imperfect and noisy peer learner would be helpful. To address the challenge, we theoretically justify that distilling knowledge from a peer learner will lead to policy improvement and propose a disadvantageous distillation strategy based on the theoretical results. The conducted experiments on several continuous control tasks show that the proposed framework achieves superior performance with a learning-based agent and function approximation without the use of expensive teacher models.
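For readers unfamiliar with the teacher-student setup this abstract starts from, the sketch below shows one common form of a policy distillation loss: the KL divergence between the teacher's and the student's action distributions for a single state. It is a generic illustration, not the disadvantageous distillation strategy proposed in the paper; the function names and the use of raw logits without a temperature are assumptions made for the example.

```python
import math


def softmax(logits):
    """Convert action logits into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits):
    """KL(teacher || student) over the action distributions for one state.

    Hypothetical helper: a generic teacher-student objective, not the
    peer-to-peer strategy described in the Dual Policy Distillation paper.
    """
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)
```

The loss is zero when the student reproduces the teacher's action distribution and grows as the two diverge. In the student-student setting the abstract describes, each learner would take a turn in the teacher's role.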


Reinforcement learning inclusion to alter design sequence of finite element modeling

Ciklamini, Cejnek

2024

Multiscale and Multidiscip. Model. Exp. and Des.


The study explores how cross-field methods, such as the design of mechanical systems via finite element modeling, can be approached with reinforcement learning as a machine learning technique for guidance in design space. The application of the epsilon-greedy algorithm for optimizing a parametric finite element model is illustrated by simulations through practical examples, namely the design of a cantilever beam and a JetVest. The results obtained clearly show that this approach can be beneficial in the field of rapid prototyping.
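As a rough illustration of the epsilon-greedy rule mentioned in this abstract, the sketch below treats each candidate design value as a bandit arm and balances random exploration against picking the best design seen so far. The function name, the candidate list, and the toy cost function are all hypothetical; in the setting the abstract describes, the cost would presumably come from a finite element solve, which is not modeled here.

```python
import random


def epsilon_greedy_search(candidates, cost, epsilon=0.1, steps=200, seed=0):
    """Choose among discrete design candidates with an epsilon-greedy rule.

    `cost` is a hypothetical stand-in for an expensive evaluation
    (e.g. a simulation); lower cost means a better design.
    """
    rng = random.Random(seed)
    counts = [0] * len(candidates)
    mean_reward = [0.0] * len(candidates)
    arms = range(len(candidates))
    for _ in range(steps):
        if rng.random() < epsilon:
            i = rng.randrange(len(candidates))          # explore: random design
        else:
            i = max(arms, key=lambda k: mean_reward[k])  # exploit: best estimate
        reward = -cost(candidates[i])                    # reward is negative cost
        counts[i] += 1
        mean_reward[i] += (reward - mean_reward[i]) / counts[i]  # running mean
    tried = [k for k in arms if counts[k] > 0]
    return candidates[max(tried, key=lambda k: mean_reward[k])]


# Toy usage: pick the thickness that minimizes a made-up quadratic cost.
best = epsilon_greedy_search([1.0, 2.0, 3.0, 4.0, 5.0],
                             lambda t: (t - 3.0) ** 2 + 1.0)
```

Because every reward in the toy example is negative, untried candidates start with the highest estimate (zero) and are each evaluated early on; with a cost that can be zero or negative, an explicit initial sweep over all candidates would be a safer choice.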
