2021
DOI: 10.48550/arxiv.2106.00517
Preprint
Cooperative Multi-Agent Transfer Learning with Level-Adaptive Credit Assignment

Abstract: Extending transfer learning to cooperative multi-agent reinforcement learning (MARL) has recently received much attention. In contrast to the single-agent setting, the coordination indispensable in cooperative MARL constrains each agent's policy. However, existing transfer methods focus exclusively on agent policies and ignore coordination knowledge. We propose a new architecture that realizes robust coordination knowledge transfer through appropriate decomposition of the overall coordination into several coord…

Cited by 3 publications (6 citation statements)
References 33 publications
“…Most research on MACA, like Harati et al. [13], uses agent-level knowledge, distributing rewards based on the distinct characteristics of each agent. Other researchers [11] pay more attention to the coordination relationships among agents, proposing coordination knowledge transfer with better generalization and scalability. Besides, Shao et al. [40] prefer a self-improvement mechanism that uses no prior information.…”
Section: Discussion
“…Shao et al. [40] propose the Mixing Network with Meta Policy Gradient (MNMPG), which assigns proper credit to each agent using a global hierarchy trained with a meta policy gradient. Zhou et al. [11] propose the Level-Adaptive QTransformer (LA-QTransformer), a novel mixing network with a multi-head attention module that combines all coordination patterns and generates the credit-assignment weights.…”
Section: Mixing Network
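The idea of attention-generated credit weights described above can be sketched roughly as follows. This is not the LA-QTransformer implementation — the function names, shapes, and the two-head averaging scheme are illustrative assumptions; it only shows the general pattern of a global state attending over per-agent embeddings to produce non-negative mixing weights.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)


def attention_credit_weights(agent_embs, state_emb, n_heads=2):
    """Illustrative sketch: the global state forms a query per head,
    agent embeddings act as keys, and the averaged head outputs give
    per-agent credit-assignment weights (non-negative, summing to 1).

    agent_embs: (n_agents, d) array, state_emb: (d,) array.
    """
    n_agents, d = agent_embs.shape
    d_head = d // n_heads
    weights = np.zeros(n_agents)
    for h in range(n_heads):
        q = state_emb[h * d_head:(h + 1) * d_head]      # query slice from the state
        k = agent_embs[:, h * d_head:(h + 1) * d_head]  # key slices from agents
        scores = k @ q / np.sqrt(d_head)                # scaled dot-product scores
        weights += softmax(scores)
    return weights / n_heads


def mix_q_values(q_values, weights):
    """Monotonic mixing: Q_tot as a non-negative-weighted combination
    of per-agent Q-values, the usual value-decomposition constraint
    that keeps per-agent argmax consistent with the joint argmax."""
    return float(np.dot(np.abs(weights), q_values))
```

Keeping the mixing weights non-negative is what preserves monotonicity of Q_tot in each agent's Q-value, which is why attention outputs (already non-negative via softmax) are a natural fit for generating them.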
“…On the other hand, MAAC [14] introduces an attention mechanism to learn the interaction relationships, while ATOC [15] exploits attentional communication to make cooperative decisions. UPDeT [37] and PIT [38] use attention-based semantic alignment between input entities and output actions to decompose the policy. Despite the promising results achieved, these works mainly rely on dense attention and therefore also attend to irrelevant entities.…”
Section: Interaction Patterns in MARL
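The dense-attention limitation noted in that statement is easy to see in a minimal sketch (generic scaled dot-product attention, not any of the cited models): because softmax is strictly positive, every entity — relevant or not — receives nonzero attention mass.

```python
import numpy as np


def dense_attention(query, entity_keys):
    """Dense (softmax) scaled dot-product attention over entities.
    Every entity gets a strictly positive weight, so attention mass
    is always spent on irrelevant entities as well.

    query: (d,) array, entity_keys: (n_entities, d) array.
    """
    scores = entity_keys @ query / np.sqrt(len(query))
    e = np.exp(scores - scores.max())  # stable softmax
    return e / e.sum()
```

Sparse-attention variants address this by allowing exact zeros in the weight vector, so attention can concentrate only on the entities that matter.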