Reinforcement Learning Algorithm for Mixed Mean Field Control Games

Angiuli, Andrea; Detering, Nils; Fouque, Jean‐Pierre; Laurière, Mathieu; Lin, Jimin

doi:10.48550/arxiv.2205.02330

Cited by 3 publications

(9 citation statements)

References 12 publications


“…As mentioned above, this paper proves the convergence of the algorithms proposed in [Angiuli et al, 2022b, Angiuli et al, 2022a]. The algorithms have been extended to the finite time horizon setting in [Angiuli et al, 2023b] and to deep actor-critic methods in [Angiuli et al, 2023a].…”

Section: Related Work

supporting

confidence: 55%

“…We then provide a numerical example illustrating the convergence. Last, we generalize our convergence result to a three-timescale RL algorithm introduced in [Angiuli et al, 2022a] to solve mixed Mean Field Control Games (MFCGs).…”

mentioning

confidence: 85%

“…In this paper, we analyze multi-timescales algorithms that have been introduced and studied numerically in [Angiuli et al, 2022b, Angiuli et al, 2022a] for mean field games, mean field control problems and mixed mean field control games. As in these references, the problems are studied in the context of finite state and action spaces, and in infinite time horizon.…”

Section: Introduction

mentioning

confidence: 99%


Unified reinforcement Q-learning for mean field game and control problems

Angiuli; Fouque; Laurière

2022

Math. Control Signals Syst.

Self Cite


We establish the convergence of the unified two-timescale Reinforcement Learning (RL) algorithm presented in [Angiuli et al., 2022b]. This algorithm provides solutions to Mean Field Game (MFG) or Mean Field Control (MFC) problems depending on the ratio of two learning rates, one for the value function and the other for the mean field term. We focus on a setting with finite state and action spaces, discrete time and infinite horizon. The proof of convergence relies on a generalization of the two-timescale approach of [Borkar, 1997]. The accuracy of approximation to the true solutions depends on the smoothing of the policies. We then provide a numerical example illustrating the convergence. Last, we generalize our convergence result to a three-timescale RL algorithm introduced in [Angiuli et al., 2022a] to solve mixed Mean Field Control Games (MFCGs).
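To make the role of the two learning rates in the abstract above concrete, here is a minimal Python sketch of a two-timescale tabular Q-learning loop. It is an illustration under stated assumptions, not the authors' algorithm: the function name `two_timescale_q_learning`, the toy dynamics, the crowd-aversion cost, and the epsilon-greedy exploration (used here in place of the smoothed policies the abstract mentions) are all hypothetical. The only feature taken from the abstract is that the value function and the mean field term are updated with separate rates, `rho_q` and `rho_mu`, whose ratio is the quantity that selects the regime.

```python
import numpy as np


def two_timescale_q_learning(n_states=3, n_actions=2, rho_q=0.1, rho_mu=0.01,
                             gamma=0.9, n_steps=5000, eps=0.1, seed=0):
    """Toy two-timescale update: Q-table at rate rho_q, mean field at rate rho_mu.

    Everything about the environment below is a hypothetical example.
    """
    rng = np.random.default_rng(seed)
    q = np.zeros((n_states, n_actions))        # cost-to-go estimates (minimised)
    mu = np.full(n_states, 1.0 / n_states)     # estimate of the state distribution
    x = 0
    for _ in range(n_steps):
        # Epsilon-greedy action choice (assumption: stands in for a smoothed policy).
        if rng.random() < eps:
            a = int(rng.integers(n_actions))
        else:
            a = int(np.argmin(q[x]))

        # Hypothetical cost: crowded states are expensive, moving costs a little.
        cost = mu[x] + 0.1 * a

        # Hypothetical dynamics: action a shifts the state cyclically, with noise.
        x_next = (x + a) % n_states
        if rng.random() < 0.1:
            x_next = int(rng.integers(n_states))

        # Mean field update at rate rho_mu: move mu toward the observed state.
        one_hot = np.zeros(n_states)
        one_hot[x_next] = 1.0
        mu += rho_mu * (one_hot - mu)

        # Value update at rate rho_q: standard Q-learning step on the cost.
        td_target = cost + gamma * q[x_next].min()
        q[x, a] += rho_q * (td_target - q[x, a])

        x = x_next
    return q, mu
```

Calling the function with `rho_mu` much smaller than `rho_q`, and then with the two swapped, is a simple way to see how the ratio changes the learned table; which ratio corresponds to the MFG solution and which to the MFC solution should be checked against the cited papers rather than read off this sketch.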


“…The Markov decision problem leads to an infinite horizon stochastic optimal control problem in discrete-time, which finds many applications in finance and economics, compare, for example, Bäuerle and Rieder (2011), Hambly et al (2021), or White (1993) for an overview. It can, among a multitude of other applications, be used to learn the optimal structure of portfolios and the optimal trading behavior, see, for example, Bertoluzzo and Corazza (2012), Chang and Lee (2017), Gold (2003), Hu and Lin (2019), Xiong et al (2018), to learn optimal hedging strategies, see, for example, Angiuli et al (2022), Angiuli et al (2021), Cao et al (2021), Dixon et al (2020), , Halperin (2020), Li et al (2009), Schäl (2002), to optimize inventory-production systems (Uğurlu, 2017), or to study socio-economic systems under the influence of climate change as in Shuvo et al (2020).…”

Section: Introduction

mentioning

confidence: 99%

Markov decision processes under model uncertainty

Neufeld; Sester; Sikic

2023

Mathematical Finance


We introduce a general framework for Markov decision problems under model uncertainty in a discrete-time infinite horizon setting. By providing a dynamic programming principle, we obtain a local-to-global paradigm, namely solving a local, that is, a one time-step robust optimization problem leads to an optimizer of the global (i.e., infinite time-steps) robust stochastic optimal control problem, as well as to a corresponding worst-case measure. Moreover, we apply this framework to portfolio optimization involving data of the S&P 500. We present two different types of ambiguity sets; one is fully data-driven given by a Wasserstein-ball around the empirical measure, the second one is described by a parametric set of multivariate normal distributions, where the corresponding uncertainty sets of the parameters are estimated from the data. It turns out that in scenarios where the market is volatile or bearish, the optimal portfolio strategies from the corresponding robust optimization problem outperform the ones without model uncertainty, showcasing the importance of taking model uncertainty into account.
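The local-to-global idea in the abstract above can be illustrated with a small Python sketch of robust value iteration. It is a simplified stand-in rather than the authors' framework: it assumes finite state and action spaces, a reward-maximising agent, and a finite ambiguity set of transition kernels (the abstract's Wasserstein-ball and parametric sets are not modelled). The function name `robust_value_iteration` and its parameters are hypothetical.

```python
import numpy as np


def robust_value_iteration(rewards, kernels, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Worst-case value iteration over a finite set of transition kernels.

    rewards: array of shape (S, A).
    kernels: array of shape (K, S, A, S); kernels[k, s, a] is a distribution
             over next states under the k-th candidate model (an assumption:
             the ambiguity set is finite and chosen per state-action pair).
    """
    v = np.zeros(rewards.shape[0])
    for _ in range(max_iter):
        # Expected next-state value under every candidate model: shape (K, S, A).
        expected = kernels @ v
        # Local robust step: take the worst model for each (state, action).
        q = rewards + gamma * expected.min(axis=0)
        v_new = q.max(axis=1)
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    # Greedy policy with respect to the final one-step robust problem.
    return v, q.argmax(axis=1)
```

Each iteration solves only the one-step worst-case problem, and repeating it to a fixed point yields the infinite horizon robust value, which is the pattern the abstract calls local-to-global.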


“…[9], [13], [19], [24], [40], to learn optimal hedging strategies, see, e.g. [3], [4], [12], [16], [17], [21], [29], [34], or even to study socio-economic systems under the influence of climate change as in [35].…”

Section: Introduction

mentioning

confidence: 99%

Markov Decision Processes under Model Uncertainty

Neufeld; Sester; Sikic

2022

Preprint


We introduce a general framework for Markov decision problems under model uncertainty in a discrete-time infinite horizon setting. By providing a dynamic programming principle we obtain a local-to-global paradigm, namely solving a local, i.e., a one time-step robust optimization problem leads to an optimizer of the global (i.e. infinite time-steps) robust stochastic optimal control problem, as well as to a corresponding worst-case measure. Moreover, we apply this framework to portfolio optimization involving data of the S&P 500. We present two different types of ambiguity sets; one is fully data-driven given by a Wasserstein-ball around the empirical measure, the second one is described by a parametric set of multivariate normal distributions, where the corresponding uncertainty sets of the parameters are estimated from the data. It turns out that in scenarios where the market is volatile or bearish, the optimal portfolio strategies from the corresponding robust optimization problem outperform the ones without model uncertainty, showcasing the importance of taking model uncertainty into account.
