Mid-price movement prediction based on limit order book data is a challenging task due to the complexity and dynamics of the limit order book. So far, there have been very few attempts to extract relevant features from limit order book data. In this paper, we address this problem by designing a new set of handcrafted features and performing an extensive experimental evaluation on both liquid and illiquid stocks. More specifically, we present an extensive set of econometric features that capture statistical properties of the underlying securities for the task of mid-price prediction. The experimental evaluation consists of a head-to-head comparison with other handcrafted features from the literature and with features extracted automatically by a long short-term memory autoencoder. Moreover, we develop a new experimental protocol for online learning that treats the task above as a multi-objective optimization problem and predicts i) the direction of the next price movement and ii) the number of order book events that occur until the change takes place. To predict the mid-price movement, the features are fed into nine different deep learning models based on multi-layer perceptrons, convolutional neural networks, and long short-term memory neural networks. The performance of the proposed method is then evaluated on liquid and illiquid stocks (i.e., TotalView-ITCH US and Nordic stocks). For some stocks, the results suggest that the correct choice of feature set and model can lead to successful prediction of how long it takes for a stock price movement to occur.
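The two prediction targets named in the abstract above (direction of the next mid-price move, and the number of order book events until it happens) can be illustrated with a minimal sketch. The function name `next_move_labels`, the input layout (one best bid and best ask per order book event), and the use of `-1` for "no later change observed" are assumptions made for illustration; the paper's actual labeling protocol may differ.

```python
import numpy as np

def next_move_labels(best_bid, best_ask):
    """Label each order book event with the next mid-price move.

    Hypothetical helper, not taken from the paper. Returns
    (direction, events_until), where direction is +1 / -1 for an
    up / down move and 0 if no later change is observed, and
    events_until is the number of events to that change (-1 if none).
    """
    bid = np.asarray(best_bid, dtype=float)
    ask = np.asarray(best_ask, dtype=float)
    mid = (bid + ask) / 2.0  # mid-price at every event
    n = len(mid)

    # nxt[t] = index of the first later event whose mid-price differs
    # from mid[t], or -1 if the mid-price never changes again.
    nxt = np.full(n, -1, dtype=int)
    for t in range(n - 2, -1, -1):  # backward scan, O(n) overall
        if mid[t + 1] != mid[t]:
            nxt[t] = t + 1
        else:
            # same mid-price as the successor, so same next change
            nxt[t] = nxt[t + 1]

    direction = np.zeros(n, dtype=int)
    events_until = np.full(n, -1, dtype=int)
    has_move = nxt >= 0
    idx = np.nonzero(has_move)[0]
    direction[idx] = np.sign(mid[nxt[idx]] - mid[idx]).astype(int)
    events_until[idx] = nxt[idx] - idx
    return direction, events_until
```

Events near the end of the sample that see no further change keep the placeholder labels and would normally be dropped before training.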
We consider the log-linear relationship between futures contracts and their underlying assets and show that, in the classical Brownian semi-martingale (BSM) framework, the two series must, by no-arbitrage, have the same integrated variance. We discuss the negligibility of stochastic interest rates using both empirical evidence and simulations. We then introduce the concept of noise cancellation and propose a generally applicable methodology to assess the performance of realized measures when the variable of interest is latent, overcoming the problem posed by the lack of a true value for the integrated variance. We carry out formal testing of several realized measures in the presence of noise and conduct a thorough simulation analysis to evaluate the estimators' sensitivity to different price and noise processes, sampling frequencies and stochastic components.
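The equal-integrated-variance claim above can be sketched in its simplest textbook form. The derivation below assumes a constant interest rate $r$, no dividends, maturity $T$, and a log-price with spot volatility $\sigma_t$; these simplifications are assumptions for illustration and not necessarily the setting the paper uses.

```latex
\begin{align*}
d \log S_t &= \mu_t \, dt + \sigma_t \, dW_t, \\
F_t &= S_t \, e^{r (T - t)}
  \quad\Longrightarrow\quad
  \log F_t = \log S_t + r (T - t), \\
d \log F_t &= d \log S_t - r \, dt, \\
[\log F]_t &= [\log S]_t = \int_0^t \sigma_u^2 \, du , \qquad 0 \le t \le T .
\end{align*}
```

The two log-prices differ only by a finite-variation term, which contributes nothing to the quadratic variation, so both series share the same integrated variance.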
In this paper we propose a straightforward approach to obtain a more efficient estimate of the integrated variance of an asset through a cross-sectional combination with a futures contract written on it. Our method constructs a variance-preserving series with reduced noise as a linear combination of the underlying asset and the futures, and bases the measurement of the integrated variance on this new series. We first illustrate how a theoretically optimal but infeasible series can be obtained and then suggest a feasible procedure to attain noise reduction. In a simulation study we verify that prevalent estimators of integrated variance applied to such noise-reduced series outperform estimators applied directly to the asset price. Finally, we apply the method to an empirical data set and, through the stabilized signature plot, we show that the noise-reduced series provides consistent integrated variance estimates using naive realized measures at very high frequencies.
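The idea of a variance-preserving, noise-reduced linear combination can be illustrated with a minimal sketch. It assumes that both log-price series share the same efficient price up to a deterministic drift, that the two noise processes are i.i.d. and mutually independent, and that noise variance is estimated by the standard realized-variance-over-2n estimator. The weighting rule and all function names are illustrative assumptions, not necessarily the feasible procedure the paper proposes.

```python
import numpy as np

def realized_variance(log_prices):
    """Naive realized variance: sum of squared log-returns."""
    r = np.diff(np.asarray(log_prices, dtype=float))
    return float(np.sum(r ** 2))

def noise_variance_estimate(log_prices):
    """RV / (2n) at the highest frequency, a common estimator of
    i.i.d. microstructure noise variance (assumed noise model)."""
    n = len(log_prices) - 1
    return realized_variance(log_prices) / (2.0 * n)

def noise_reduced_series(log_spot, log_fut):
    """Combine spot and futures log-prices with weights summing to one.

    Weights that sum to one keep the efficient-price component intact,
    so the integrated variance is preserved. With independent noises,
    w = var_fut / (var_spot + var_fut) minimizes the combined noise
    variance. Hypothetical helper for illustration only.
    """
    s = np.asarray(log_spot, dtype=float)
    f = np.asarray(log_fut, dtype=float)
    var_s = noise_variance_estimate(s)
    var_f = noise_variance_estimate(f)
    w = var_f / (var_s + var_f)  # weight on the spot series
    return w * s + (1.0 - w) * f, w
```

Applying `realized_variance` to the combined series instead of the spot series alone should reduce the noise-induced upward bias whenever the assumptions above hold approximately.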