2018
DOI: 10.48550/arxiv.1802.08757
Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents

Abstract: We consider the problem of fully decentralized multi-agent reinforcement learning (MARL), where the agents are located at the nodes of a time-varying communication network. Specifically, we assume that the reward functions of the agents might correspond to different tasks, and are only known to the corresponding agent. Moreover, each agent makes individual decisions based on both the information observed locally and the messages received from its neighbors over the network. Within this setting, the collective g…
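The abstract's setting, in which each agent combines local information with messages from network neighbors, is often modeled with consensus-style neighbor averaging. The following is a minimal illustrative sketch of one such averaging round, not the paper's algorithm: the weight matrix `W`, the ring topology, and the scalar per-agent estimates are all made-up example values.

```python
import numpy as np

def consensus_step(theta, W):
    """One round of neighbor averaging: theta[i] <- sum_j W[i, j] * theta[j]."""
    return W @ theta

# Hypothetical ring of 4 agents; each agent mixes equally with itself and
# its two neighbors, giving a doubly stochastic weight matrix.
W = np.array([
    [1/3, 1/3, 0.0, 1/3],
    [1/3, 1/3, 1/3, 0.0],
    [0.0, 1/3, 1/3, 1/3],
    [1/3, 0.0, 1/3, 1/3],
])
theta = np.array([0.0, 1.0, 2.0, 3.0])  # per-agent local estimates

for _ in range(50):
    theta = consensus_step(theta, W)

print(np.round(theta, 3))  # every agent converges to the network average 1.5
```

Because `W` is doubly stochastic and the graph is connected, repeated averaging drives all agents to the mean of the initial estimates, which is the basic mechanism that lets purely local updates reach network-wide agreement.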

Cited by 46 publications (72 citation statements)
References 71 publications
“…The potential of peer-to-peer communication is highlighted by progress in decentralized control, but recent work has incorporated only a limited study of realistic propagation models, schedule functions, multi-hop behavior, and other real-world aspects of complex networks. The simplest network abstractions used in robotics studies focus on swarms as a graph structure, using a model where the nearest neighbors can communicate fully [25], [26]. Some work uses line-of-sight (LOS) instead of purely geometric distance in order to determine which agents can communicate [27], [28], but LOS is not an accurate model for RF networks.…”
Section: B. Communication in Multi-Agent Robotics
confidence: 99%
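The "simplest network abstraction" criticized above, in which nearest neighbors within range are assumed to communicate fully, can be sketched as a geometric disk graph. This is an illustrative example only; the positions, radius, and function name are made up.

```python
import numpy as np

def disk_graph(positions, radius):
    """Adjacency matrix: 1 if two distinct agents lie within `radius` of each other."""
    n = len(positions)
    adj = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(n):
            if i != j and np.linalg.norm(positions[i] - positions[j]) <= radius:
                adj[i, j] = 1
    return adj

# Three agents on a line: 0 and 1 are within range, 2 is far away.
positions = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
print(disk_graph(positions, radius=1.5))
```

The critique in the quoted passage is precisely that such purely geometric (or line-of-sight) connectivity ignores RF propagation, scheduling, and multi-hop effects of real networks.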
“…i.e., q_j^N(x, a) is an approximation of the centralized Q-function's maximum when agent j uses the control u^(j) = a. We re-express (8) and let Q_j^N be a monotonically increasing approximation of the Q-function for agent j after N iterations, given the available batch data, that satisfies:…”
Section: A. Derivation
confidence: 99%
“…In multi-agent reinforcement learning, agents make sequential decisions to maximize their joint or individual rewards [7], [8]. The agents can be fully cooperative, i.e., maximizing a joint reward function, fully competitive, i.e., the agents' objectives are opposed, or a combination of both [7], [8].…”
Section: Introduction
confidence: 99%
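The three reward regimes named in the citation above (fully cooperative, fully competitive, mixed) can be made concrete with a toy sketch. The function names and values below are illustrative, not from any cited paper.

```python
def cooperative_rewards(joint_reward, n_agents):
    """Fully cooperative: every agent receives the same joint reward."""
    return [joint_reward] * n_agents

def competitive_rewards(payoff):
    """Fully competitive (two-player zero-sum): one agent's gain is the other's loss."""
    return [payoff, -payoff]

print(cooperative_rewards(1.0, 3))  # [1.0, 1.0, 1.0]
print(competitive_rewards(0.5))     # [0.5, -0.5]
```

Mixed settings assign each agent its own reward that is neither shared nor exactly opposed, which is the case the abstract's per-agent, privately known reward functions cover.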