2021
DOI: 10.1007/s10458-021-09506-w

Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning

Abstract: Recent years have seen the application of deep reinforcement learning techniques to cooperative multi-agent systems, with great empirical success. However, given the lack of theoretical insight, it remains unclear what the employed neural networks are learning, or how we should enhance their learning power to address the problems on which they fail. In this work, we empirically investigate the learning power of various network architectures on a series of one-shot games. Despite their simplicity, these games c…

Cited by 14 publications (31 citation statements)
References 18 publications (26 reference statements)

Deep Coordination Graphs
Böhmer, Kurin, Whiteson (2019). Preprint. Self Cite.
“…If the employed value function does not have the representational capacity to distinguish the values of coordinated and uncoordinated actions, an optimal policy cannot be learned. However, Castellini et al (2019) show that higher-order factorization of the value function works surprisingly well in one-shot games that are vulnerable to relative overgeneralization, even if each factor depends on the actions of only a small subset of agents. Such a higher-order factorization can be expressed as an undirected coordination graph (CG, Guestrin et al, 2002a), where each vertex represents one agent and each (hyper-)edge one payoff function over the joint action space of the connected agents.…”
Section: Introduction (mentioning)
confidence: 98%
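The statement above describes a value function factored over an undirected coordination graph, with one payoff function per (hyper-)edge. Below is a minimal sketch of that idea for a three-agent chain graph; the edge set, action-space size, and random payoff tables are illustrative assumptions (Castellini et al. use a small neural network per edge rather than a table), not details taken from either paper.

```python
import itertools
import numpy as np

# Illustrative 3-agent chain coordination graph with edges (0,1) and (1,2).
# Each edge carries one payoff function over the joint actions of the two
# agents it connects (here a random table standing in for a learned network).
n_agents, n_actions = 3, 4
rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2)]
payoffs = {e: rng.normal(size=(n_actions, n_actions)) for e in edges}

def q_joint(actions):
    """Factored joint value: sum of the per-edge payoffs."""
    return sum(payoffs[(i, j)][actions[i], actions[j]] for i, j in edges)

# For a game this small the greedy joint action can be found by enumeration;
# larger graphs would use message passing (e.g. max-sum) instead.
best = max(itertools.product(range(n_actions), repeat=n_agents), key=q_joint)
print("greedy joint action:", best, "value:", round(q_joint(best), 3))
```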
“…Sparse cooperative Q-learning (Kok & Vlassis, 2006) applies CGs to MARL but does not scale to modern benchmarks, as each payoff function (f_12 and f_23 in Figure 1b) is represented as a table over the state and joint action space of the connected agents. Castellini et al (2019) use neural networks to approximate payoff functions, but only in one-shot games, and still require a unique function for each edge in the CG. Consequently, value decomposition networks (VDN, Sunehag et al, 2018) correspond to an unconnected CG.…”
Section: Introduction (mentioning)
confidence: 99%
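To make the contrast with an unconnected CG concrete, the sketch below (a hedged illustration; the 2x2 payoff matrix and the least-squares fit are my own assumptions) shows why a VDN-style sum of per-agent utilities cannot distinguish coordinated from uncoordinated actions in a simple coordination payoff, while a single pairwise edge represents it exactly.

```python
import numpy as np

# Two agents, two actions; reward only when the actions match.
Q = np.array([[1.0, 0.0],
              [0.0, 1.0]])

# Best additive (VDN-style, unconnected CG) fit: Q(a1, a2) ~ u1[a1] + u2[a2],
# solved as a least-squares problem over the four joint actions.
rows, targets = [], []
for a1 in range(2):
    for a2 in range(2):
        row = np.zeros(4)
        row[a1] = 1.0       # selects u1[a1]
        row[2 + a2] = 1.0   # selects u2[a2]
        rows.append(row)
        targets.append(Q[a1, a2])
u, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
Q_additive = np.add.outer(u[:2], u[2:])

print(Q_additive.round(2))  # every entry 0.5: all joint actions look identical
# A single pairwise payoff function f_12 can simply store Q and is exact.
```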

Deep Coordination Graphs
Böhmer, Kurin, Whiteson (2019). Preprint. Self Cite.
“…Importantly, we show that carefully learned sparse graphs can significantly outperform complete graphs. We expect these observations to be a good supplementation to Castellini et al [23] and can eliminate any possible misunderstanding about sparse coordination graphs. Stability and Recommendation Table 3 in Appendix A shows the stability of each implementation on all tasks across the MACO benchmark.…”
Section: Which Methods Is Better For Learning Dynamically Sparse Coor... (mentioning)
confidence: 60%
“…Again, we observe a performance gap between complete coordination graphs and most implementations of context-aware sparse graphs. Castellini et al [23] finds that (randomly) sparse coordination graphs perform much worse than full graphs. This is aligned with our experimental results.…”
Section: Which Methods Is Better For Learning Dynamically Sparse Coor... (mentioning)
confidence: 99%
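The two statements above compare complete, learned-sparse, and randomly sparse coordination graphs. As a purely illustrative sketch (the agent count, sparsity level, and random payoffs are assumptions, and the actual methods learn both the edges and the payoffs), the only structural difference between the variants is the edge set over which the factored value is summed:

```python
import itertools
import numpy as np

n_agents, n_actions = 6, 3
rng = np.random.default_rng(1)

# Complete coordination graph: one payoff function for every pair of agents.
complete_edges = list(itertools.combinations(range(n_agents), 2))

# Randomly sparse graph: keep a fixed fraction of the pairs; context-aware
# methods would instead choose the edges from the current observation.
idx = rng.choice(len(complete_edges), size=len(complete_edges) // 3, replace=False)
sparse_edges = [complete_edges[k] for k in idx]

def make_payoffs(edges):
    return {e: rng.normal(size=(n_actions, n_actions)) for e in edges}

def q_joint(actions, edges, payoffs):
    # Same factored form in both cases: sum of per-edge payoffs.
    return sum(payoffs[(i, j)][actions[i], actions[j]] for i, j in edges)

print(len(complete_edges), "edges in the complete graph vs",
      len(sparse_edges), "in the randomly sparse one")
```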