2021
DOI: 10.48550/arxiv.2106.00517
Preprint
Cooperative Multi-Agent Transfer Learning with Level-Adaptive Credit Assignment

Abstract: Extending transfer learning to cooperative multi-agent reinforcement learning (MARL) has recently received much attention. In contrast to the single-agent setting, the coordination indispensable in cooperative MARL constrains each agent's policy. However, existing transfer methods focus exclusively on agent policies and ignore coordination knowledge. We propose a new architecture that realizes robust coordination knowledge transfer through appropriate decomposition of the overall coordination into several coord…

Cited by 3 publications (6 citation statements)
References 33 publications
“…Most research on MACA, like Harati et al. [13], uses agent-level knowledge, distributing rewards based on the distinct characteristics of each agent. Other researchers [11] pay more attention to the coordination relationships among agents, proposing coordination knowledge transfer with better generalization and scalability. Besides, Shao et al. [40] prefer a self-improvement mechanism that uses no prior information.…”
Section: Discussion
“…Shao et al. [40] propose the Mixing Network with Meta Policy Gradient (MNMPG), which assigns proper credit to each agent using a global hierarchy trained with a meta policy gradient. Zhou et al. [11] propose the Level-Adaptive QTransformer (LA-QTransformer), a novel mixing network with a multi-head attention module that combines all coordination patterns and generates the credit-assignment weights.…”
Section: Mixing Network
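The idea of attention-generated credit weights described above can be sketched roughly as follows. This is not the LA-QTransformer implementation — the function names, shapes, and the two-head averaging scheme are illustrative assumptions; it only shows the general pattern of a global state attending over per-agent embeddings to produce non-negative mixing weights.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def attention_credit_weights(agent_embs, state_emb, n_heads=2):
    """Illustrative sketch: the global state forms a query per head,
    agent embeddings act as keys, and the averaged head outputs give
    per-agent credit-assignment weights (non-negative, summing to 1).

    agent_embs: (n_agents, d) array, state_emb: (d,) array.
    """
    n_agents, d = agent_embs.shape
    d_head = d // n_heads
    weights = np.zeros(n_agents)
    for h in range(n_heads):
        q = state_emb[h * d_head:(h + 1) * d_head]      # query slice from the state
        k = agent_embs[:, h * d_head:(h + 1) * d_head]  # key slices from agents
        scores = k @ q / np.sqrt(d_head)                # scaled dot-product scores
        weights += softmax(scores)
    return weights / n_heads


def mix_q_values(q_values, weights):
    """Monotonic mixing: Q_tot as a non-negative-weighted combination
    of per-agent Q-values, the usual value-decomposition constraint
    that keeps per-agent argmax consistent with the joint argmax."""
    return float(np.dot(np.abs(weights), q_values))
```

Keeping the mixing weights non-negative is what preserves monotonicity of Q_tot in each agent's Q-value, which is why attention outputs (already non-negative via softmax) are a natural fit for generating them.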
“…On the other hand, MAAC [14] introduces an attention mechanism to learn the interaction relationships, while ATOC [15] exploits attentional communication to make cooperative decisions. UPDeT [37] and PIT [38] use attention-based semantic alignment between input entities and output actions to decompose the policy. Despite the promising results achieved, these works mainly rely on dense attention and therefore also attend to irrelevant entities.…”
Section: Interaction Patterns in MARL
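The dense-attention limitation noted in that statement is easy to see in a minimal sketch (generic scaled dot-product attention, not any of the cited models): because softmax is strictly positive, every entity — relevant or not — receives nonzero attention mass.

```python
import numpy as np


def dense_attention(query, entity_keys):
    """Dense (softmax) scaled dot-product attention over entities.
    Every entity gets a strictly positive weight, so attention mass
    is always spent on irrelevant entities as well.

    query: (d,) array, entity_keys: (n_entities, d) array.
    """
    scores = entity_keys @ query / np.sqrt(len(query))
    e = np.exp(scores - scores.max())  # stable softmax
    return e / e.sum()
```

Sparse-attention variants address this by allowing exact zeros in the weight vector, so attention can concentrate only on the entities that matter.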