Mean-Field Controls with Q-Learning for Cooperative MARL: Convergence and Complexity Analysis

Gu, Haotian; Guo, Xin; Wei, Xiaoli; Xu, Renyuan

doi:10.1137/20m1360700

Cited by 27 publications

(33 citation statements)

References 23 publications


“…As a result of non-uniform interaction, the so-called mean-field effect of the population on an agent is determined by the identity of the agent. This is in stark contrast with other existing works (Gu et al, 2021; Mondal et al, 2021) where the presumption of exchangeability washes away the dependence on identity. We demonstrate that, if the reward of each agent is an affine function of the mean-field distribution 'seen' by that agent, then the standard MFC approach can approximate the non-uniform MARL with an error bound of e = O((1/√N)[√|X| + √|U|]). We would like to emphasize the importance of this result.…”

Section: Contributions (contrasting)

confidence: 88%

“…MFC as an Approximation to Uniform MARL: Recently, MFC is gaining traction as a scalable approximate solution to uniform MARL. On the theory side, recently it has been proven that MFC can approximate uniform MARL within an error of O(1/√N) (Gu et al, 2021). However, the result relies on the assumption that all agents are homogeneous.…”

Section: Related Work (mentioning)

confidence: 99%
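
The O(1/√N) rate quoted above can be made concrete with a small numerical sketch. The snippet below is only an illustration of why rates of this order arise, not the cited proof: it assumes N agents whose states are drawn independently from a hypothetical limiting distribution `mu`, and measures how far the empirical state distribution is from `mu`. The distribution, the trial count, and the function name are all assumptions made for this example.

```python
import numpy as np

def mean_l1_gap(mu, n_agents, n_trials, rng):
    """Average L1 distance between the empirical state distribution of
    n_agents i.i.d. agents and the limiting distribution mu."""
    # Each row of counts holds the number of agents in each state for one trial.
    counts = rng.multinomial(n_agents, mu, size=n_trials)
    empirical = counts / n_agents
    return np.abs(empirical - mu).sum(axis=1).mean()

rng = np.random.default_rng(0)
mu = np.array([0.5, 0.3, 0.2])  # hypothetical limiting distribution over 3 states

# Quadrupling the population should roughly halve the gap if it scales as 1/sqrt(N).
gaps = {n: mean_l1_gap(mu, n, 2000, rng) for n in (100, 400, 1600)}
for n, gap in gaps.items():
    print(f"N={n:5d}  gap={gap:.4f}  gap*sqrt(N)={gap * np.sqrt(n):.3f}")
```

If the gap scales as 1/√N, the last printed column stays roughly constant as N grows. The full approximation result in the cited work also has to handle controlled dynamics and rewards, which this sketch does not model.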

“…It is grounded on the idea that in an infinite population of homogeneous agents, it is sufficient to study the behaviour of only one representative agent in order to draw accurate conclusions about the whole population. Recent studies have shown that, if the agents are exchangeable, then MFC can be proven to be a good approximation of MARL (Gu et al, 2021).…”

Section: Introduction (mentioning)

confidence: 99%


Can Mean Field Control (MFC) Approximate Cooperative Multi Agent Reinforcement Learning (MARL) with Non-Uniform Interaction?

Mondal, Aggarwal, Ukkusuri

2022

Preprint


Mean-Field Control (MFC) is a powerful tool to solve Multi-Agent Reinforcement Learning (MARL) problems. Recent studies have shown that MFC can well-approximate MARL when the population size is large and the agents are exchangeable. Unfortunately, the presumption of exchangeability implies that all agents uniformly interact with one another, which is not true in many practical scenarios. In this article, we relax the assumption of exchangeability and model the interaction between agents via an arbitrary doubly stochastic matrix. As a result, in our framework, the mean-fields 'seen' by different agents are different. We prove that, if the reward of each agent is an affine function of the mean-field seen by that agent, then one can approximate such a non-uniform MARL problem via its associated MFC problem within an error of e = O((1/√N)[√|X| + √|U|]), where N is the population size and |X|, |U| are the sizes of state and action spaces respectively. Finally, we develop a Natural Policy Gradient (NPG) algorithm that can provide a solution to the non-uniform MARL with an error O(max{e, ε}) and a sample complexity of O(ε^{-3}) for any ε > 0.


“…As already mentioned, the update rule (8.4) is provided here simply for the purpose of explaining the basic idea of Q-learning in a mean-field setup. We refer to [72,58,80,81,99] for more details on mean-field MDPs and Q-learning for mean field control. Other methods have also been investigated, such as policy gradient, see [110], which can be proved to converge with a linear rate for linear-quadratic MFC problems, see [57].…”

Section: A Glance At Model-free Methods (mentioning)

confidence: 99%
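
The idea of Q-learning on a mean-field MDP mentioned in this statement can be sketched on a toy problem. The code below is a minimal illustration under stated assumptions, not the algorithm of the cited paper or update rule (8.4): it assumes a hypothetical two-state model in which the lifted state is p, the population mass in state 1, discretized on a grid by nearest-point projection; the lifted action h assigns one local action to each local state; and the transition probabilities in `FLIP`, the crowd-averse reward, and the discount `GAMMA` are invented for the example. It applies tabular Q-learning updates in sweeps over all grid points and lifted actions, which presumes access to a simulator of the mean-field dynamics.

```python
import itertools
import numpy as np

# Hypothetical toy model: two local states, p = population mass in state 1.
GRID = np.linspace(0.0, 1.0, 21)                      # discretized lifted states
ACTIONS = list(itertools.product((0, 1), repeat=2))   # h = (action in state 0, action in state 1)
FLIP = {0: 0.1, 1: 0.9}                               # prob. an agent switches state under action u
GAMMA = 0.9                                           # discount factor

def step(p, h):
    """Deterministic mean-field transition and population-averaged reward.
    Agents prefer the less crowded state, and action 1 costs 0.2."""
    reward = 2 * p * (1 - p) - 0.2 * ((1 - p) * h[0] + p * h[1])
    p_next = (1 - p) * FLIP[h[0]] + p * (1 - FLIP[h[1]])
    return p_next, reward

def nearest(p):
    """Project a distribution onto the index of the closest grid point."""
    return int(np.abs(GRID - p).argmin())

def q_learning(n_sweeps=300, alpha=0.5):
    """Tabular Q-learning updates over every (lifted state, lifted action) pair."""
    q = np.zeros((len(GRID), len(ACTIONS)))
    for _ in range(n_sweeps):
        for s, a in itertools.product(range(len(GRID)), range(len(ACTIONS))):
            p_next, r = step(GRID[s], ACTIONS[a])
            target = r + GAMMA * q[nearest(p_next)].max()
            q[s, a] += alpha * (target - q[s, a])
    return q

q = q_learning()
s_mid = nearest(0.5)
print("greedy lifted action at p = 0.5:", ACTIONS[int(q[s_mid].argmax())])
```

The design choice to illustrate is that the Q-function is indexed by the population distribution and a policy-valued action rather than by an individual agent's state and action. The grid projection here is the crudest possible discretization and stands in for whatever approximation scheme a real method would use.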

Numerical Methods for Mean Field Games and Mean Field Type Control

Laurière

2021

Preprint


Mean Field Games (MFG) have been introduced to tackle games with a large number of competing players. Considering the limit when the number of players is infinite, Nash equilibria are studied by considering the interaction of a typical player with the population's distribution. The situation in which the players cooperate corresponds to Mean Field Control (MFC) problems, which can also be viewed as optimal control problems driven by a McKean-Vlasov dynamics. These two types of problems have found a wide range of potential applications, for which numerical methods play a key role since most models do not have analytical solutions. In these notes, we review several aspects of numerical methods for MFG and MFC. We start by presenting some heuristics in a basic linear-quadratic setting. We then discuss numerical schemes for forward-backward systems of partial differential equations (PDEs), optimization techniques for variational problems driven by a Kolmogorov-Fokker-Planck PDE, an approach based on a monotone operator viewpoint, and stochastic methods relying on machine learning tools.


“…Another related line of research including [11,24,25] establishes a dynamic programming principle for mean-field control problems (without regularization), where the problem is formulated as an MDP on the space of measures, and a Q-learning algorithm is designed for learning the optimal solution.…”

Section: Introduction (mentioning)

confidence: 99%

Exploratory LQG Mean Field Games with Entropy Regularization

Firoozi, Jaimungal

2020

Preprint


We study a general class of entropy-regularized multi-variate LQG mean field games (MFGs) in continuous time with K distinct subpopulations of agents. We extend the notion of actions to action distributions (exploratory actions), and explicitly derive the optimal action distributions for individual agents in the limiting MFG. We demonstrate that the optimal set of action distributions yields an ε-Nash equilibrium for the finite-population entropy-regularized MFG. Furthermore, we compare the resulting solutions with those of classical LQG MFGs and establish the equivalence of their existence.
