Deep Implicit Coordination Graphs for Multi-agent Reinforcement Learning

Li, Sheng; Gupta, Jayesh K.; Morales, Peter; Allen, Ross; Kochenderfer, Mykel J.

doi:10.48550/arxiv.2006.11438

Cited by 6 publications

(10 citation statements)

References 37 publications

“…Coordinating graph formulation is one of the methods for determining the joint action between agents based on the structure of interactions. In [55], Deep Implicit Coordination Graph (DICG) architecture is introduced, which includes two modules, one for obtaining the dynamic coordination graph structure and the other for learning the implicit reasoning about common actions or values. DICG uses the actor-critic structure to improve coordination for multi-agent situations.…”

Section: Interaction Methods Between Multi-agents With GNN Architecture (mentioning)

confidence: 99%

“…In this approach, the agents have access to the state and complete information during the training step, but in some environments, the learned policy must be applied in a decentralized manner, and the agents cannot access the full state in the execution phase. In this method, the purpose of each agent is to perform actions that maximize their utility function (joint value function), but such decentralization can result in sub-optimal actions [55].…”

Section: Different Methods For Computing Value Function In MARL (mentioning)

confidence: 99%

Graph Neural Networks and Reinforcement Learning: A Survey

Fathinezhad, Adibi, Shoushtarian et al. 2023

Deep Learning and Reinforcement Learning

Graph neural network (GNN) is an emerging field of research that tries to generalize deep learning architectures to work with non-Euclidean data. Nowadays, combining deep reinforcement learning (DRL) with GNN for graph-structured problems, especially in multi-agent environments, is a powerful technique in modern deep learning. From the computational point of view, multi-agent environments are inherently complex, because future rewards depend on the joint actions of multiple agents. This chapter tries to examine different types of applying GNN and DRL techniques in the most common representations of multi-agent problems and their challenges. In general, the fusion of GNN and DRL can be addressed from two different points of view. First, GNN is used to influence the DRL performance and improve its formulation. Here, GNN is applied in relational DRL structures such as multi-agent and multi-task DRL. Second, DRL is used to improve the application of GNN. From this viewpoint, DRL can be used for a variety of purposes including neural architecture search and improving the explanatory power of GNN predictions.

“…Coordination graphs are a classical technique for planning in multi-agent systems (Guestrin et al, 2001; 2002b). They are combined with multi-agent deep reinforcement learning by recent work (Castellini et al, 2019; Böhmer et al, 2020; Li et al, 2020; Wang et al, 2021b). Joint action selection on coordination graphs can be modeled as a decentralized constraint optimization problem (DCOP), and previous methods compute approximate solutions by message passing among agents (Pearl, 1988).…”

Section: Related Work (mentioning)

confidence: 99%
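
To make the joint action selection problem in the statement above concrete, here is a minimal Python sketch. It assumes a factored value made of per-agent utilities plus per-edge payoffs and finds the best joint action by exhaustive enumeration. It is not the message-passing approximation the quoted text refers to, and the names `joint_value` and `best_joint_action` and all payoff numbers are hypothetical, chosen only for illustration.

```python
from itertools import product


def joint_value(actions, unary, pairwise):
    """Factored value: per-agent utilities plus per-edge payoffs."""
    total = sum(unary[i][a] for i, a in enumerate(actions))
    for (i, j), table in pairwise.items():
        total += table[actions[i]][actions[j]]
    return total


def best_joint_action(n_actions, unary, pairwise):
    """Exhaustive search over all joint actions.

    The search space has n_actions ** n_agents entries, which is why
    approximate schemes are used on larger graphs.
    """
    n_agents = len(unary)
    return max(
        product(range(n_actions), repeat=n_agents),
        key=lambda acts: joint_value(acts, unary, pairwise),
    )


# Hypothetical 3-agent chain 0 - 1 - 2 with two actions per agent.
unary = [[0.0, 0.1], [0.0, 0.0], [0.2, 0.0]]
pairwise = {
    (0, 1): [[1.0, 0.0], [0.0, 2.0]],  # payoff table for edge (0, 1)
    (1, 2): [[1.5, 0.0], [0.0, 1.0]],  # payoff table for edge (1, 2)
}

best = best_joint_action(2, unary, pairwise)
best_value = joint_value(best, unary, pairwise)
print(best, best_value)
```

In this toy instance, agent 2 would pick action 0 if it looked only at its own utility, yet the jointly best choice has it pick action 1 because of the edge payoffs. That gap is the kind of coordination effect the graph is meant to capture.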

Self-Organized Polynomial-Time Coordination Graphs

Yang, Dong, Ren et al. 2021

Preprint

The coordination graph is a promising approach to model agent collaboration in multi-agent reinforcement learning. It factorizes a large multi-agent system into a suite of overlapping groups that represent the underlying coordination dependencies. One critical challenge in this paradigm is the complexity of computing maximum-value actions for a graph-based value factorization. This corresponds to the decentralized constraint optimization problem (DCOP), which is NP-hard, as is its constant-ratio approximation. To bypass this fundamental hardness, this paper proposes a novel method, named Self-Organized Polynomial-time Coordination Graphs (SOP-CG), which uses structured graph classes to guarantee the optimality of the induced DCOPs with sufficient function expressiveness. We extend the graph topology to be state-dependent, formulate the graph selection as an imaginary agent, and finally derive an end-to-end learning paradigm from the unified Bellman optimality equation. In experiments, we show that our approach learns interpretable graph topologies, induces effective coordination, and improves performance across a variety of cooperative multi-agent tasks.

“…To define the edge weights of the graph $G$, i.e., $\{w_{uv}\}_{(u,v)\in E}$, we use an attention mechanism similar to the one proposed in [34]. In particular, the agent observations at each time step are first encoded using a shared encoder mechanism $\phi : Z \to \mathbb{R}^F$ to an $F$-dimensional embedding.…”

Section: Attention-based Edge Weights (mentioning)

confidence: 99%
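
The statement above can be illustrated with a rough Python sketch of attention-derived edge weights. The excerpt does not give the exact scoring function, so the dot-product scores, the softmax normalization, the inclusion of self-loops, and the linear form of the shared encoder are all assumptions. The names `encode` and `attention_edge_weights` and the numbers are hypothetical.

```python
import math


def encode(obs, weights):
    """Assumed shared linear encoder: observation -> F-dimensional embedding."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def attention_edge_weights(observations, weights):
    """Return w[u][v] as a softmax over v of embedding dot products.

    Every agent is encoded with the same `weights`, mirroring the
    shared encoder in the quoted text. Self-loops (u == v) are kept,
    which is an assumption of this sketch.
    """
    emb = [encode(o, weights) for o in observations]
    result = []
    for hu in emb:
        scores = [sum(a * b for a, b in zip(hu, hv)) for hv in emb]
        result.append(softmax(scores))
    return result


# Hypothetical setup: 3 agents, 2-dimensional observations, F = 2.
observations = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
encoder_weights = [[1.0, -1.0], [0.5, 2.0]]
edge_weights = attention_edge_weights(observations, encoder_weights)
for row in edge_weights:
    print([round(w, 3) for w in row])
```

Each row of `edge_weights` sums to one, so the weights can be read as how much attention agent u pays to every agent v.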

“…We use the exponential linear unit (ELU) as the non-linearity and set $F = 128$. Moreover, for the mixing GNN, we use a graph convolutional network (GCN) [12] as also used in [34]. Since the attention mechanism described in Section 4.1 leads to an effective outgoing degree of one for each graph node, the combining operation in (5) can be simplified as where $\sigma(\cdot)$ denotes a non-linearity, and…”

Section: Map Name (mentioning)

confidence: 99%
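
As a generic illustration of the ingredients named above (a graph convolution followed by an ELU non-linearity), the following Python sketch applies one propagation step of the form ELU(A H W). It is not the simplified combining operation from the citing paper, whose equation is not captured in the excerpt. The adjacency matrix, features, and weights are hypothetical, and a tiny feature dimension is used instead of F = 128 for readability.

```python
import math


def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, alpha*(exp(x)-1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, features, weight):
    """One graph-convolution step: ELU(adj @ features @ weight)."""
    mixed = matmul(matmul(adj, features), weight)
    return [[elu(v) for v in row] for row in mixed]


# Hypothetical 2-node graph with row-normalized edge weights.
adj = [[0.5, 0.5], [0.0, 1.0]]
features = [[1.0, -2.0], [3.0, 0.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity, so only mixing and ELU act

out = gcn_layer(adj, features, weight)
print(out)
```

With the identity weight matrix, the first output row is the ELU of the average of both nodes' features, while the second node only sees itself.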

Graph Convolutional Value Decomposition in Multi-Agent Reinforcement Learning

Naderializadeh, Hung, Soleyman et al. 2020

Preprint

We propose a novel framework for value function factorization in multi-agent deep reinforcement learning using graph neural networks (GNNs). In particular, we consider the team of agents as the set of nodes of a complete directed graph, whose edge weights are governed by an attention mechanism. Building upon this underlying graph, we introduce a mixing GNN module, which is responsible for two tasks: i) factorizing the team state-action value function into individual per-agent observation-action value functions, and ii) explicit credit assignment to each agent in terms of fractions of the global team reward. Our approach, which we call GraphMIX, follows the centralized training and decentralized execution paradigm, enabling the agents to make their decisions independently once training is completed. Experimental results on the StarCraft II multi-agent challenge (SMAC) environment demonstrate the superiority of our proposed approach as compared to the state-of-the-art.
