We establish the convergence of the unified two-timescale Reinforcement Learning (RL) algorithm presented in [Angiuli et al., 2022b]. This algorithm yields the solution of a Mean Field Game (MFG) or a Mean Field Control (MFC) problem depending on the ratio of two learning rates, one for the value function and one for the mean field term. We focus on a setting with finite state and action spaces, discrete time, and infinite horizon. The proof of convergence relies on a generalization of the two-timescale approach of [Borkar, 1997]. The accuracy of the approximation to the true solutions depends on the smoothing of the policies. We then provide a numerical example illustrating the convergence. Finally, we generalize our convergence result to a three-timescale RL algorithm introduced in [Angiuli et al., 2022a] to solve mixed Mean Field Control Games (MFCGs).
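To fix ideas, the coupled updates behind such a two-timescale scheme can be written schematically as follows (the notation here is ours, chosen for illustration, and omits the projections and smoothed policies used in the actual convergence proof):

\[
Q_{k+1}(x,a) = Q_k(x,a) + \rho^Q_k \Big( r(x, a, \mu_k) + \gamma \max_{a'} Q_k(X_{k+1}, a') - Q_k(x,a) \Big),
\]
\[
\mu_{k+1} = \mu_k + \rho^\mu_k \big( \delta_{X_{k+1}} - \mu_k \big),
\]

where \(\delta_{X_{k+1}}\) denotes the Dirac mass at the newly visited state. Roughly speaking, when \(\rho^\mu_k/\rho^Q_k \to 0\) the mean field is quasi-static while \(Q\) equilibrates to a best response, which targets the MFG solution; in the opposite regime the mean field tracks the current policy and the scheme targets the MFC solution.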
This project investigates numerical methods for solving fully coupled forward-backward stochastic differential equations (FBSDEs) of McKean-Vlasov type. Numerical solvers for such mean field FBSDEs are of interest because of the potential application of these equations to optimization problems over a large population, for instance mean field games (MFG) and optimal mean field control problems. The theory of such problems has met with great success since the early works on mean field games by Lasry and Lions, see [27], and by Huang, Caines, and Malhamé, see [24]. Generally speaking, the purpose is to understand the continuum limit of optimizers, or of equilibria (say in the Nash sense), as the number of underlying players tends to infinity. When approached from the probabilistic viewpoint, solutions to these control problems (or games) can be described by coupled mean field FBSDEs, meaning that the coefficients depend upon the marginal laws of the solution itself. In this note, we detail two methods for solving such FBSDEs, which we implement and apply to five benchmark problems. The first method uses a tree structure to represent the pathwise laws of the solution, whereas the second uses a grid discretization to represent its time marginal laws. Both are based on a Picard scheme; importantly, we combine each of them with a generic continuation method that makes it possible to extend the time horizon (or, equivalently, the coupling strength between the two equations) for which the Picard iteration converges.
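The Picard-plus-continuation pattern can be sketched generically. The following Python snippet is a minimal schematic of our own: the toy map F stands in for one sweep of either FBSDE solver, and all names and parameters are illustrative assumptions, not the implementations described above.

```python
import numpy as np

def picard(F, x0, tol=1e-10, max_iter=1000):
    """Inner loop: plain Picard (fixed-point) iteration x <- F(x)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = F(x)
        if np.max(np.abs(x_new - x)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("Picard iteration did not converge")

def continuation(F, x0, lams):
    """Outer loop: solve for an increasing sequence of coupling
    strengths, warm-starting each Picard solve at the solution
    obtained for the previous (weaker) coupling."""
    x = np.asarray(x0, dtype=float)
    for lam in lams:
        x = picard(lambda y: F(y, lam), x)
    return x

# Toy stand-in for one sweep of a solver at coupling strength lam.
F = lambda x, lam: np.cos(lam * x)

sol = continuation(F, x0=0.0, lams=np.linspace(0.1, 1.0, 10))
print(sol)  # fixed point of x = cos(x), approximately 0.739
```

The design point is the warm start: each Picard solve begins from the solution of the previous, weaker coupling, so the iteration stays within its basin of convergence even when a direct solve at full coupling (or full horizon) would fail.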
We present a Reinforcement Learning (RL) algorithm to solve infinite horizon asymptotic Mean Field Game (MFG) and Mean Field Control (MFC) problems. Our approach can be described as a unified two-timescale Mean Field Q-learning: the same algorithm can learn either the MFG or the MFC solution by simply tuning a parameter. The algorithm is in discrete time and space, where the agent not only provides an action to the environment but also a distribution of the state in order to take into account the mean field feature of the problem. Importantly, we assume that the agent cannot observe the population's distribution and needs to estimate it in a model-free manner. The asymptotic MFG and MFC problems are presented in continuous time and space, and compared with classical (non-asymptotic or stationary) MFG and MFC problems. They lead to explicit solutions in the linear-quadratic (LQ) case that are used as benchmarks for the results of our algorithm.
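As a minimal sketch of such a unified two-timescale Q-learning loop, consider the following Python toy. The two-state environment, the crowd-aversion reward, and the learning rates are our own illustrative assumptions, not the paper's benchmark.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting (assumed for illustration): two states, two actions,
# with a reward penalized by the population mass mu at the current state.
n_states, n_actions, gamma = 2, 2, 0.9

def step(x, a, mu):
    """One transition of the environment seen by the representative agent."""
    x_next = a if rng.random() < 0.9 else 1 - a   # the action mostly decides the next state
    r = float(x == 0) - 2.0 * mu[x]               # base reward minus a crowd-aversion penalty
    return x_next, r

def unified_mfq(rho_Q, rho_mu, n_iter=100_000, eps=0.1):
    """Unified two-timescale mean-field Q-learning (schematic):
    only the ratio rho_Q / rho_mu decides which problem is solved."""
    Q = np.zeros((n_states, n_actions))
    mu = np.full(n_states, 1.0 / n_states)        # model-free estimate of the state distribution
    x = 0
    for _ in range(n_iter):
        a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[x]))
        x_next, r = step(x, a, mu)
        Q[x, a] += rho_Q * (r + gamma * Q[x_next].max() - Q[x, a])
        one_hot = np.zeros(n_states); one_hot[x_next] = 1.0
        mu += rho_mu * (one_hot - mu)             # push mu toward the visited state
        x = x_next
    return Q, mu

Q_mfg, mu_mfg = unified_mfq(rho_Q=0.1, rho_mu=0.001)   # Q faster: targets the MFG solution
Q_mfc, mu_mfc = unified_mfq(rho_Q=0.001, rho_mu=0.1)   # mu faster: targets the MFC solution
print(mu_mfg, mu_mfc)
```

Note that the environment never exposes the population distribution to the agent: mu is estimated from the visited states alone, matching the model-free assumption above.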
We present a new combined Mean Field Control Game (MFCG) problem, which can be interpreted as a competitive game between collaborating groups, and its solution as a Nash equilibrium between the groups. Within each group, the players coordinate their strategies. An example of such a situation is a modification of the classical trader's problem. Groups of traders maximize their wealth. They are faced with transaction costs for their own trades and a cost for their own terminal position. In addition, they face a cost for the average holding within their group. The asset price is impacted by the trades of all agents. We propose a reinforcement learning algorithm to approximate the solution of such mixed Mean Field Control Game problems. We test the algorithm on benchmark linear-quadratic specifications for which we have analytic solutions.
Mean field games (MFG) and mean field control problems (MFC) are frameworks to study Nash equilibria or social optima in games with a continuum of agents. These problems can be used to approximate competitive or cooperative games with a large finite number of agents and have found a broad range of applications, in particular in economics. In recent years, the question of learning in MFG and MFC has garnered interest, both as a way to compute solutions and as a way to model how large populations of learners converge to an equilibrium. Of particular interest is the setting where the agents do not know the model, which leads to the development of reinforcement learning (RL) methods. After reviewing the literature on this topic, we present a two-timescale approach with RL for MFG and MFC, which relies on a unified Q-learning algorithm. The main novelty of this method is to simultaneously update an action-value function and a distribution, but with different rates, in a model-free fashion. Depending on the ratio of the two learning rates, the algorithm learns either the MFG or the MFC solution. To illustrate this method, we apply it to a mean field problem of accumulated consumption in finite horizon with a HARA utility function, and to a trader's optimal liquidation problem.