Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence 2020
DOI: 10.24963/ijcai.2020/627
An End-to-End Optimal Trade Execution Framework based on Proximal Policy Optimization

Abstract: In this article, we propose an end-to-end adaptive framework for optimal trade execution based on Proximal Policy Optimization (PPO). We use two methods to account for the time dependencies in the market data, based on two different neural network architectures: 1) Long short-term memory (LSTM) networks, and 2) Fully-connected networks (FCN) that stack the most recent limit order book (LOB) information as model inputs. The proposed framework can make trade execution decisions based on level-2 limit order book…
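The abstract describes two state encoders feeding a PPO agent: an LSTM that reads the sequence of recent LOB snapshots, and an FCN that takes the same snapshots stacked into a single input vector. Below is a minimal, hypothetical sketch (not the authors' released code) of how those two input pipelines could plug into one actor-critic network; the per-snapshot feature count, window length, and discrete action set are assumptions, since the excerpt does not specify them.

```python
# Hypothetical sketch, not the paper's implementation: two encoders for a
# window of recent level-2 LOB snapshots, feeding a PPO-style actor-critic.
# N_FEATURES, WINDOW and N_ACTIONS are illustrative assumptions.
import torch
import torch.nn as nn

N_FEATURES = 40   # per-snapshot level-2 LOB features (assumed: 10 levels x price/size x 2 sides)
WINDOW = 20       # number of most recent snapshots fed to the model (assumed)
N_ACTIONS = 11    # discrete trading actions, e.g. child-order sizes (assumed)

class LSTMEncoder(nn.Module):
    """Variant 1: an LSTM consumes the snapshot sequence; the last hidden state is the summary."""
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(N_FEATURES, hidden, batch_first=True)

    def forward(self, lob_window):           # (batch, WINDOW, N_FEATURES)
        _, (h_n, _) = self.lstm(lob_window)
        return h_n[-1]                        # (batch, hidden)

class StackedFCNEncoder(nn.Module):
    """Variant 2: stack (flatten) the recent snapshots and pass them through an MLP."""
    def __init__(self, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(WINDOW * N_FEATURES, 128), nn.ReLU(),
            nn.Linear(128, hidden), nn.ReLU(),
        )

    def forward(self, lob_window):           # (batch, WINDOW, N_FEATURES)
        return self.mlp(lob_window.flatten(start_dim=1))

class ActorCritic(nn.Module):
    """Shared encoder with separate policy (actor) and value (critic) heads, as PPO requires."""
    def __init__(self, encoder, hidden=64):
        super().__init__()
        self.encoder = encoder
        self.policy_head = nn.Linear(hidden, N_ACTIONS)
        self.value_head = nn.Linear(hidden, 1)

    def forward(self, lob_window):
        z = self.encoder(lob_window)
        return self.policy_head(z), self.value_head(z)

# Either encoder plugs into the same actor-critic interface.
model = ActorCritic(LSTMEncoder())
logits, value = model(torch.randn(8, WINDOW, N_FEATURES))
```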

Cited by 20 publications (29 citation statements); references 8 publications (11 reference statements).

“…The most popular types of RL methods that have been used in optimal execution problems are Q-learning algorithms and (double) DQN [102,159,241,108,51,181,158]. Policy-based algorithms are also popular in this field, including (deep) policy gradient methods [100,241], A2C [241], PPO [51,140], and DDPG [235]. The benchmark strategies studied in these papers include the Almgren-Chriss solution [102,100], the TWAP strategy [159,51,140], the VWAP strategy [140], and the SnL policy [158,235].…”
Section: Optimal Execution
confidence: 99%
“…Policy-based algorithms are also popular in this field, including (deep) policy gradient methods [100,241], A2C [241], PPO [51,140], and DDPG [235]. The benchmark strategies studied in these papers include the Almgren-Chriss solution [102,100], the TWAP strategy [159,51,140], the VWAP strategy [140], and the SnL policy [158,235]. In some models the trader is allowed to buy or sell the asset at each time point [108,241,217,56], whereas there are also many models where only one trading direction is allowed [158,102,100,159,51,181,235,140].…”
Section: Optimal Execution
confidence: 99%
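Several of the citing passages above compare the learned policies against the TWAP benchmark, which simply splits the parent order evenly across the available time buckets. A minimal illustrative sketch of that baseline (the function and variable names are our own, not taken from the cited papers):

```python
# Illustrative TWAP benchmark: split a parent order of `total_shares` evenly
# across `n_slices` time buckets, keeping the sizes integral.
def twap_schedule(total_shares: int, n_slices: int) -> list[int]:
    """Return per-slice child order sizes that sum exactly to total_shares."""
    base, remainder = divmod(total_shares, n_slices)
    # Spread the leftover shares one at a time over the earliest slices.
    return [base + (1 if i < remainder else 0) for i in range(n_slices)]

print(twap_schedule(10_000, 6))   # [1667, 1667, 1667, 1667, 1666, 1666]
```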
“…PPO is another widely used RL method for OE. Lin and Beling [78] proposed an end-to-end PPO-based framework.…”
Section: RL in Order Execution
confidence: 99%