2021 · DOI: 10.1002/int.22778

ME‐MADDPG: An efficient learning‐based motion planning method for multiple agents in complex environments

Abstract: Developing efficient motion policies for multiple agents is a challenge in decentralized, dynamic situations where each agent plans its own path without knowing the policies of the other agents involved. This paper presents an efficient learning‐based motion planning method for multiagent systems. It adopts the framework of multiagent deep deterministic policy gradient (MADDPG) to directly map partially observed information to motion commands for multiple agents. To improve the efficiency of MADDPG in sample uti…
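The abstract describes the standard MADDPG structure the method builds on: each agent's actor maps only its own partial observation to a motion command, while training uses a centralized critic that sees every agent's observations and actions. The sketch below illustrates that structure only; it is not the paper's ME‐MADDPG implementation, and the agent count, network sizes, and variable names are illustrative assumptions.

```python
# Minimal sketch of MADDPG's decentralized-actor / centralized-critic layout
# (illustrative assumptions only, not the paper's ME-MADDPG code).
import torch
import torch.nn as nn

N_AGENTS, OBS_DIM, ACT_DIM = 3, 16, 2   # assumed sizes for illustration


class Actor(nn.Module):
    """Each agent maps only its own partial observation to a motion command."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(OBS_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, ACT_DIM), nn.Tanh())

    def forward(self, obs):
        return self.net(obs)


class CentralCritic(nn.Module):
    """During training, the critic scores the joint observations and actions of all agents."""
    def __init__(self):
        super().__init__()
        joint_dim = N_AGENTS * (OBS_DIM + ACT_DIM)
        self.net = nn.Sequential(nn.Linear(joint_dim, 128), nn.ReLU(),
                                 nn.Linear(128, 1))

    def forward(self, joint_obs, joint_act):
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))


actors = [Actor() for _ in range(N_AGENTS)]
critic = CentralCritic()

# One fabricated batch of joint observations, just to show the data flow.
batch = 8
obs = torch.randn(batch, N_AGENTS, OBS_DIM)
acts = torch.stack([actors[i](obs[:, i]) for i in range(N_AGENTS)], dim=1)
q_value = critic(obs.reshape(batch, -1), acts.reshape(batch, -1))
print(q_value.shape)  # torch.Size([8, 1])
```

In full MADDPG training, each agent also keeps target networks and draws batches from a shared replay buffer, and each actor is updated through the centralized critic's gradient; only the decentralized actors are needed at execution time.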

Cited by 22 publications (3 citation statements) · References 28 publications (40 reference statements)
“…There has been a considerable extension to the MADDPG paradigm, where recurrent DPGs were used for complex environments such as Cognitive Electronic Warfare [18] and partially observable environments for communication systems [19]. A mixed-environment approach was taken for complex environments using MADDPG [20], whereas a decomposed approach was introduced for learning multi-agent policies for UAV clusters to build a connected communication network [21]. Further, this set of algorithms has also been discussed in the context of smart grids for edge technology [22] and has been shown to perform considerably well compared to other state-of-the-art methods.…”
Section: B. Multi-agent Deep Deterministic Policy Gradients (With Prio…)
Mentioning confidence: 99%
“…The CM relies on Reinforcement Learning (RL)-based methods that use iterative algorithms to converge to an optimal navigation policy [30][31][32]. It is common in MAS to use methods based on Deep Reinforcement Learning (DRL), a powerful tool that combines neural networks with RL algorithms and allows each agent to learn from its interactions with the environment [33][34][35][36][37][38]. Despite the effectiveness of RL-based methods, their main disadvantage in MAS is the computational complexity and the large amount of data required to converge to the global policy.…”
Section: Introduction
Mentioning confidence: 99%
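The statement above summarizes DRL only at a high level, so here is a minimal sketch of the agent–environment interaction loop that such methods are built on; the toy one-dimensional environment, reward values, and random action choice are purely illustrative assumptions and do not correspond to any cited method.

```python
# Minimal sketch of the interaction loop a DRL agent learns from (illustrative only).
import random


class ToyEnv:
    """A 1-D corridor: the agent starts at 0 and is rewarded for reaching position 5."""
    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):               # action: -1 (step left) or +1 (step right)
        self.pos += action
        done = self.pos >= 5
        reward = 1.0 if done else -0.1    # small per-step penalty encourages short paths
        return self.pos, reward, done


env = ToyEnv()
state, done = env.reset(), False
transitions = []                          # the experience a DRL learner would train on
for _ in range(200):                      # step cap so the random policy always terminates
    action = random.choice([-1, 1])       # a learned (neural) policy would replace this
    next_state, reward, done = env.step(action)
    transitions.append((state, action, reward, next_state, done))
    state = next_state
    if done:
        break
print(f"collected {len(transitions)} transitions")
```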
“…Trajectory generation methods rooted in PDEs have the advantage that they are easy to implement, entirely interpretable, and can provide theoretical guarantees regarding optimality, robustness, and other concerns. This distinguishes them from sampling- and learning-based algorithms (for example, [15,16,17,18,19]), which often sacrifice interpretability for efficiency. The main drawbacks of the PDE-based methods are their lack of efficiency and scalability.…”
Section: Introduction
Mentioning confidence: 99%