We present a framework for hedging a portfolio of derivatives in the presence of market frictions such as transaction costs, market impact, liquidity constraints, or risk limits, using modern deep reinforcement learning methods. We discuss how standard reinforcement learning methods can be applied to non-linear reward structures, i.e., in our case, convex risk measures. As a general contribution to the use of deep learning for stochastic processes, we also show in Section 4 that the set of constrained trading strategies used by our algorithm is large enough to ε-approximate any optimal solution. Our algorithm can be implemented efficiently even in high-dimensional situations using modern machine learning tools. Its structure does not depend on specific market dynamics, and it generalizes across hedging instruments, including the use of liquid derivatives. Its computational performance is largely invariant in the size of the portfolio, as it depends mainly on the number of hedging instruments available. We illustrate our approach by showing the effect of transaction costs on hedging in a synthetic market driven by the Heston model, where we outperform the standard "complete market" solution.
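The objective sketched in this abstract, minimizing a convex risk measure of terminal P&L under transaction costs, can be illustrated with a small numerical toy. The sketch below uses a Black-Scholes market and a one-parameter family of scaled delta hedges instead of the paper's neural-network policies and Heston market; all parameter values are illustrative assumptions.

```python
import numpy as np
from math import erf

rng = np.random.default_rng(0)

# Toy market: Black-Scholes paths (the paper uses Heston; this is only
# an illustration, with assumed parameter values).
n_paths, n_steps = 5_000, 30
S0, K, sigma, T, cost = 100.0, 100.0, 0.2, 30.0 / 365.0, 0.002
dt = T / n_steps
t = np.linspace(0.0, T, n_steps + 1)
Z = rng.standard_normal((n_paths, n_steps))
logret = -0.5 * sigma**2 * dt + sigma * np.sqrt(dt) * Z
S = np.concatenate([np.full((n_paths, 1), S0),
                    S0 * np.exp(np.cumsum(logret, axis=1))], axis=1)

# Black-Scholes delta (r = 0) at every rebalancing date.
Phi = np.vectorize(lambda x: 0.5 * (1.0 + erf(x / np.sqrt(2.0))))
tau = T - t[:-1]
d1 = (np.log(S[:, :-1] / K) + 0.5 * sigma**2 * tau) / (sigma * np.sqrt(tau))
base_delta = Phi(d1)

def cvar(pnl, alpha=0.95):
    """Expected shortfall of losses: an example of a convex risk measure."""
    losses = -pnl
    q = np.quantile(losses, alpha)
    return losses[losses >= q].mean()

def hedged_pnl(scale):
    """P&L of a short call hedged with scale * (BS delta), proportional costs."""
    delta = scale * base_delta
    gains = (delta * np.diff(S, axis=1)).sum(axis=1)
    trades = np.abs(np.diff(delta, axis=1, prepend=0.0))
    costs = cost * (trades * S[:, :-1]).sum(axis=1)
    return gains - costs - np.maximum(S[:, -1] - K, 0.0)

# Grid search over the hedge scale: under transaction costs, the
# risk-minimizing hedge need not be the frictionless delta.
scales = np.linspace(0.0, 1.2, 25)
risks = [cvar(hedged_pnl(s)) for s in scales]
best = scales[int(np.argmin(risks))]
print(f"best scale {best:.2f}, CVaR {min(risks):.2f}, unhedged {risks[0]:.2f}")
```

In the paper, the scalar `scale` is replaced by a constrained neural-network trading policy trained by stochastic gradient methods; the grid search here only illustrates that a convex risk measure gives a well-defined hedging objective in the presence of frictions.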
The universal approximation properties with respect to L^p-type criteria of three important families of reservoir computers with stochastic discrete-time semi-infinite inputs are shown. First, it is proved that linear reservoir systems with either polynomial or neural network readout maps are universal. More importantly, it is proved that the same property holds for two families with linear readouts, namely, trigonometric state-affine systems and echo state networks, which are the most widely used reservoir systems in applications. The linearity in the readouts is a key feature in supervised machine learning applications. It guarantees that these systems can be used in high-dimensional situations and in the presence of large datasets. The L^p criteria used in this paper allow the formulation of universality results that do not necessarily impose almost sure uniform boundedness in the inputs or the fading memory property in the filter that needs to be approximated.
This work studies approximation based on single-hidden-layer feedforward and recurrent neural networks with randomly generated internal weights. These methods, in which only the last layer of weights and a few hyperparameters are optimized, have been successfully applied in a wide range of static and dynamic learning problems. Despite the popularity of this approach in empirical tasks, important theoretical questions regarding the relation between the unknown function, the weight distribution, and the approximation rate have remained open. In this work it is proved that, as long as the unknown function, functional, or dynamical system is sufficiently regular, it is possible to draw the internal weights of the random (recurrent) neural network from a generic distribution (not depending on the unknown object) and quantify the error in terms of the number of neurons and the hyperparameters. In particular, this proves that echo state networks with randomly generated weights are capable of approximating a wide class of dynamical systems arbitrarily well and thus provides the first mathematical explanation for their empirically observed success at learning dynamical systems.
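The two preceding abstracts both concern reservoir systems in which the internal weights are randomly generated and only a linear readout is trained. A minimal echo state network along these lines can be sketched as follows; the target system, reservoir size, and spectral-radius heuristic are illustrative assumptions, not the constructions analyzed in the papers.

```python
import numpy as np

rng = np.random.default_rng(1)

# Target: a simple contracting nonlinear system driven by input u
# (illustrative only; chosen so it has fading memory).
T_len = 2000
u = rng.uniform(-0.5, 0.5, T_len)
y = np.zeros(T_len)
for k in range(T_len - 1):
    y[k + 1] = 0.4 * y[k] + np.tanh(u[k]) + 0.1 * y[k] * u[k]

# Echo state network: fixed random reservoir, only the readout is trained.
n_res = 200
W_in = rng.uniform(-0.5, 0.5, n_res)
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius < 1 heuristic

x = np.zeros(n_res)
states = np.zeros((T_len, n_res))
for k in range(T_len):
    x = np.tanh(W @ x + W_in * u[k])
    states[k] = x

# Linear readout by ridge regression: the only trained parameters.
washout = 100
X, target = states[washout:-1], y[washout + 1:]
lam = 1e-6
W_out = np.linalg.solve(X.T @ X + lam * np.eye(n_res), X.T @ target)

pred = X @ W_out
mse = np.mean((pred - target) ** 2)
print(f"one-step prediction MSE: {mse:.5f}")
```

The linearity of the readout is what makes the training step a single regularized least-squares solve, which is the practical advantage emphasized in both abstracts.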
Ultra-wideband (UWB) localization is a recent technology that promises to outperform many indoor localization methods currently available. Yet, non-line-of-sight (NLOS) positioning scenarios can create large biases in the time-difference-of-arrival (TDOA) measurements, and must be addressed with accurate measurement models in order to avoid significant localization errors. In this work, we first develop an efficient, closed-form TDOA error model and analyze its estimation characteristics by calculating the Cramér-Rao lower bound (CRLB). We subsequently detail how an online expectation-maximization (EM) algorithm is adopted to compute the maximum likelihood estimate of the model parameters. We perform real experiments on a mobile robot equipped with a UWB emitter, and show that the online estimation algorithm leads to excellent localization performance due to its ability to adapt to the varying NLOS path conditions over time.
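The online EM step described above is specific to the paper's closed-form TDOA error model. As a generic illustration of the EM mechanics it relies on, the sketch below fits a two-component Gaussian mixture to synthetic ranging errors with an NLOS-like positively biased component; it is batch rather than online EM, and all distributions and parameters are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic 1-D "ranging errors": a tight LOS noise component plus a
# positively biased, wider NLOS-like component (illustrative values).
n = 5000
los = rng.normal(0.0, 0.1, int(0.7 * n))
nlos = rng.normal(0.8, 0.3, n - los.size)
data = np.concatenate([los, nlos])

def em_gmm(x, iters=100):
    """Batch EM for a two-component 1-D Gaussian mixture."""
    mu = np.quantile(x, [0.25, 0.75])      # init means from data quantiles
    var = np.full(2, x.var())
    pi = np.full(2, 0.5)
    for _ in range(iters):
        # E-step: responsibilities under the current parameters.
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        resp = pi * dens
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: closed-form weighted updates.
        nk = resp.sum(axis=0)
        pi = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return pi, mu, var

pi, mu, var = em_gmm(data)
print("weights:", pi, "means:", mu)
```

An online variant, as in the paper, would replace the batch sums in the M-step with running averages updated one measurement at a time.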
We study risk-sharing equilibria with general convex costs on the agents' trading rates. For an infinite-horizon model with linear state dynamics and exogenous volatilities, the equilibrium returns mean-revert around their frictionless counterparts: the deviation has Ornstein-Uhlenbeck dynamics for quadratic costs, whereas it follows a doubly-reflected Brownian motion if costs are proportional. More general models with arbitrary state dynamics and endogenous volatilities lead to multidimensional systems of nonlinear, fully-coupled forward-backward SDEs. These fall outside the scope of known well-posedness results, but can be solved numerically using the simulation-based deep-learning approach of [28]. In a calibration to time series of returns, bid-ask spreads, and trading volume, transaction costs substantially affect equilibrium asset prices. In contrast, the effects of different cost specifications are rather similar, justifying the use of quadratic costs as a proxy for other less tractable specifications.
Mathematics Subject Classification (2010): 91G10, 91G80, 60H10. JEL Classification: C68, D52, G11, G12.
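The quadratic-cost case above, where the deviation of equilibrium returns from their frictionless counterparts follows an Ornstein-Uhlenbeck process, can be illustrated with a short Euler simulation; the mean-reversion speed and volatility below are illustrative values, not calibrated ones from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)

# Euler scheme for the OU deviation  dX_t = -kappa * X_t dt + sigma dW_t
# (kappa and sigma are assumed, illustrative values).
kappa, sigma = 5.0, 0.5
dt, n_steps = 0.01, 50_000
X = np.empty(n_steps + 1)
X[0] = 0.0
noise = rng.standard_normal(n_steps)
for k in range(n_steps):
    X[k + 1] = X[k] - kappa * X[k] * dt + sigma * np.sqrt(dt) * noise[k]

# Mean reversion: the stationary variance of an OU process is sigma^2/(2*kappa),
# and a long sample path should match it.
emp_var = X[n_steps // 2:].var()
theo_var = sigma**2 / (2.0 * kappa)
print(f"empirical variance {emp_var:.4f} vs theoretical {theo_var:.4f}")
```

Under proportional costs the deviation is instead a doubly-reflected Brownian motion, i.e. it diffuses freely inside a no-trade band and is reflected at its edges rather than being pulled back continuously.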