2021
DOI: 10.1137/20m1360700

Mean-Field Controls with Q-Learning for Cooperative MARL: Convergence and Complexity Analysis

Cited by 27 publications (33 citation statements)
References 23 publications
“…As a result of non-uniform interaction, the so-called mean-field effect of the population on an agent is determined by the identity of the agent. This is in stark contrast with other existing works (Gu et al, 2021; Mondal et al, 2021) where the presumption of exchangeability washes away the dependence on identity. We demonstrate that, if the reward of each agent is an affine function of the mean-field distribution 'seen' by that agent, then the standard MFC approach can approximate the non-uniform MARL with an error bound of e O( 1 We would like to emphasize the importance of this result.…”
Section: Contributions (contrasting)
confidence: 88%
“…MFC as an Approximation to Uniform MARL: Recently, MFC is gaining traction as a scalable approximate solution to uniform MARL. On the theory side, recently it has been proven that MFC can approximate uniform MARL within an error of O(1/√N) (Gu et al, 2021). However, the result relies on the assumption that all agents are homogeneous.…”
Section: Related Work (mentioning)
confidence: 99%
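
The O(1/√N) rate quoted above has a simple law-of-large-numbers intuition, which the following minimal sketch illustrates. It is not the analysis of Gu et al.; it only checks numerically that the empirical state distribution of N independent two-state agents deviates from its mean-field limit at roughly a 1/√N rate. The function name `empirical_gap`, the probability `p`, and the trial count are all assumptions chosen for illustration.

```python
import random


def empirical_gap(n_agents, p=0.3, trials=1000, seed=0):
    """Average |empirical fraction - p| over `trials` simulated populations.

    Each agent is independently in state 1 with probability p (a hypothetical
    two-state model), so p plays the role of the mean-field distribution.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Count how many of the n_agents agents land in state 1.
        count = sum(1 for _ in range(n_agents) if rng.random() < p)
        total += abs(count / n_agents - p)
    return total / trials


if __name__ == "__main__":
    # Quadrupling N should roughly halve the gap if the rate is 1/sqrt(N).
    for n in (100, 400, 1600):
        print(n, empirical_gap(n))
```

Under these assumptions, quadrupling the population size should roughly halve the printed gap, which is the scaling an O(1/√N) bound predicts.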
“…As already mentioned, the update rule (8.4) is provided here simply for the purpose of explaining the basic idea of Q-learning in a mean-field setup. We refer to [72,58,80,81,99] for more details on mean-field MDPs and Q-learning for mean field control. Other methods have also been investigated, such as policy gradient, see [110], which can be proved to converge with a linear rate for linear-quadratic MFC problems, see [57].…”
Section: A Glance At Model-free Methods (mentioning)
confidence: 99%
“…Another related line of research including [11,24,25] establishes a dynamic programming principle for mean-field control problems (without regularization), where the problem is formulated as an MDP on the space of measures, and a Q-learning algorithm is designed for learning the optimal solution.…”
Section: Introduction (mentioning)
confidence: 99%