2020
DOI: 10.48550/arxiv.2001.03415
Preprint

Multi-Agent Interactions Modeling with Correlated Policies

Abstract: In multi-agent systems, complex interacting behaviors arise due to the high correlations among agents. However, previous work on modeling multi-agent interactions from demonstrations is primarily constrained by assuming the independence among policies and their reward structures. In this paper, we cast the multi-agent interactions modeling problem into a multi-agent imitation learning framework with explicit modeling of correlated policies by approximating opponents' policies, which can recover agents' policies…
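As a rough illustration of the correlated-policy idea in the abstract, the following is a minimal sketch of one way to factor a joint policy as an approximate opponent model times an ego policy conditioned on the opponent's action, i.e. π(a_i, a_-i | s) = π̂_-i(a_-i | s) · π_i(a_i | s, a_-i). All class and variable names, network sizes, and the discrete-action setting are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class CorrelatedPolicy(nn.Module):
    """Sketch: agent i's policy conditioned on an approximated opponent action,
    so the joint factorizes as
    pi(a_i, a_-i | s) = opp_model(a_-i | s) * pi_i(a_i | s, a_-i)."""

    def __init__(self, state_dim, act_dim_i, act_dim_opp, hidden=64):
        super().__init__()
        # Opponent model: approximates pi_-i(a_-i | s) with a categorical head.
        self.opp_model = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim_opp))
        # Ego policy: conditions on the state and a (sampled) opponent action.
        self.ego = nn.Sequential(
            nn.Linear(state_dim + act_dim_opp, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim_i))

    def forward(self, state):
        # Sample an opponent action from the learned opponent model.
        opp_logits = self.opp_model(state)
        opp_dist = torch.distributions.Categorical(logits=opp_logits)
        a_opp = opp_dist.sample()
        a_opp_onehot = nn.functional.one_hot(a_opp, opp_logits.shape[-1]).float()
        # Condition the ego policy on the sampled opponent action.
        ego_logits = self.ego(torch.cat([state, a_opp_onehot], dim=-1))
        ego_dist = torch.distributions.Categorical(logits=ego_logits)
        a_i = ego_dist.sample()
        # Joint log-probability under the correlated factorization.
        joint_logp = opp_dist.log_prob(a_opp) + ego_dist.log_prob(a_i)
        return a_i, a_opp, joint_logp
```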

Cited by 4 publications (4 citation statements)
References 23 publications

“…While previous works have delved into correlated policies through various methodologies, such as explicit modeling and recursive reasoning frameworks, our approach diverges by prioritizing the maximization of MI between actions of multiple agents. This emphasis on MI serves as a comprehensive measure of correlation, aiming to foster effective coordination among agents in MARL settings [19,20]. A promising direction is to leverage principles from information theory to design coordination strategies.…”
Section: Related Work
confidence: 99%
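For reference, the mutual information between two agents' actions referred to in the statement above is the standard quantity (generic notation, not taken verbatim from the cited papers):

I(a^i; a^j) = \mathbb{E}_{p(a^i, a^j)}\!\left[\log \frac{p(a^i, a^j)}{p(a^i)\, p(a^j)}\right],

which is zero exactly when the two agents' action distributions are independent, so maximizing it encourages correlated, coordinated behavior.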
“…The standard maximum entropy MARL method learns the joint stochastic policy 𝝅(𝒖_t | 𝒐_t) while the individual policy π_i(u_i | o_i) is unavailable, which violates the CTDE framework. To deal with the issue, we attempt to use the multivariate Gaussian distribution N_M to model the interaction between individual policies because the behavioral strategies reflect agents' cooperation relationship [18]. Let d denote the dimension of the action space for each agent and Σ denote the covariance matrix of the multivariate Gaussian distribution.…”
Section: Collaborative Exploration Module
confidence: 99%
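A minimal sketch of the multivariate-Gaussian construction described in the statement above: a single Gaussian over the concatenated per-agent action dimensions, whose covariance matrix Σ couples the agents' actions. The shapes, the example covariance, and the function name are illustrative assumptions, not the cited paper's implementation.

```python
import numpy as np

def sample_correlated_actions(means, sigma, rng=None):
    """Sample one joint action for n agents, each with a d-dimensional
    continuous action, from a multivariate Gaussian whose covariance
    couples the agents' action dimensions.

    means : (n, d) array of per-agent action means
    sigma : (n*d, n*d) covariance matrix encoding cross-agent correlations
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = means.shape
    flat_mean = means.reshape(n * d)
    joint = rng.multivariate_normal(flat_mean, sigma)  # one correlated draw
    return joint.reshape(n, d)                         # split back per agent

# Example: two agents with 2-D actions, positively correlated across agents.
n, d = 2, 2
means = np.zeros((n, d))
sigma = np.eye(n * d) + 0.5 * np.kron(np.array([[0.0, 1.0], [1.0, 0.0]]),
                                      np.eye(d))
actions = sample_correlated_actions(means, sigma)
print(actions.shape)  # (2, 2)
```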
“…The correlated policies are considered in several other works too. [15] proposed the explicit modeling of correlated policies for multi-agent imitation learning, and [25] proposed a probabilistic recursive reasoning framework. By introducing a latent variable and variational lower bound on mutual information, the proposed VM3-AC increases the correlation among policies without communication in the execution phase and without explicit dependency across agents' actions.…”
Section: Appendix A: Related Work
confidence: 99%
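For context on the variational lower bound mentioned in the statement above, the standard Barber–Agakov bound on the mutual information between two agents' actions a^i and a^j reads (generic notation assumed here; the cited paper's exact bound and latent-variable construction may differ):

I(a^i; a^j) \;\ge\; H(a^i) + \mathbb{E}_{p(a^i, a^j)}\!\left[\log q_\phi(a^i \mid a^j)\right],

where q_φ is any variational approximation to the conditional p(a^i | a^j); the bound is tight when q_φ matches the true conditional.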
“…Such non-correlated factorization of the joint policy limits the agents to learn coordinated behavior due to negligence of the influence of other agents [25,2]. However, learning coordinated behavior is one of the fundamental problems in MARL [25,15].…”
Section: Introduction
confidence: 99%