“…The Markov decision problem leads to an infinite-horizon stochastic optimal control problem in discrete time, which finds many applications in finance and economics; see, for example, Bäuerle and Rieder (2011), Hambly et al (2021), or White (1993) for an overview. Among a multitude of other applications, it can be used to learn the optimal structure of portfolios and optimal trading behavior, see, for example, Bertoluzzo and Corazza (2012), Chang and Lee (2017), Gold (2003), Hu and Lin (2019), Xiong et al (2018); to learn optimal hedging strategies, see, for example, Angiuli et al (2022), Angiuli et al (2021), Cao et al (2021), Dixon et al (2020), Halperin (2020), Li et al (2009), Schäl (2002); to optimize inventory-production systems (Uğurlu, 2017); or to study socio-economic systems under the influence of climate change as in Shuvo et al (2020).…”