Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications III 2021
DOI: 10.1117/12.2585808

Survey of recent multi-agent reinforcement learning algorithms utilizing centralized training

Abstract: Much work has been dedicated to the exploration of Multi-Agent Reinforcement Learning (MARL) paradigms implementing a centralized learning with decentralized execution (CLDE) approach to achieve human-like collaboration in cooperative tasks. Here, we discuss variations of centralized training and describe a recent survey of algorithmic approaches. The goal is to explore how different implementations of information-sharing mechanisms in centralized learning may give rise to distinct group coordinated behaviors …

Citations: cited by 24 publications (17 citation statements)
References: 29 publications (22 reference statements)
“…This helps to achieve high-level information about system dynamics without getting trapped in the difficulties of coordinating information flow between multiple learners. Centralized learning and training can be integrated with either a central decision maker or decentralized excitation [184]. In the second category, the centralized learner learns the value function using the criteria for guiding distributed actors [185].…”
Section: Learning Mechanism (mentioning)
confidence: 99%
“…Information fusion is a widely studied topic in robotics for decades [5]. As opposed to the common paradigm of controlling multiple robots using a centralised controller [6] or fusing data from various sources [7], our objective is to gather data from individually operating robots each with a different skill to train a single robot that possesses diverse skills appropriately learned from all other robots by fusing knowledge. Roboticists have attempted to perform knowledge fusion at the perception stage or decision-making stage of the robot autonomy stack.…”
Section: Related Work (mentioning)
confidence: 99%
“…In this subsection, we propose a distributed ZOO algorithm with asynchronous sample and update schemes based on the BCD algorithm (12) and the gradient approximation for each agent i. According to (16), we have the following approximation for each agent i at step k:…”
Section: Distributed ZOO Algorithm With Asynchronous Samplings (mentioning)
confidence: 99%
“…Multi-agent networks are one of the most representative systems that have broad applications and usually induce large-size optimization problems [12]. In recent years, distributed zeroth-order convex and non-convex optimizations on multi-agent networks have been extensively studied, e.g., [13]-[17], all of which decompose the original cost function into multiple functions and assign them to the agents.…”
Section: Introduction (mentioning)
confidence: 99%