The use of reinforcement learning in algorithmic trading is of growing interest, since it offers the possibility of making a profit through autonomous artificial traders that do not depend on hard-coded rules. In such a framework, keeping uncertainty under control is as important as maximizing expected returns. Risk aversion has been addressed in reinforcement learning through measures related to the distribution of returns. In trading, however, it is essential to also control the risk of portfolio positions at intermediate steps. In this paper, we define a novel risk measure, which we call reward volatility, consisting of the variance of the rewards under the state-occupancy measure. We show that this new risk measure bounds the return variance, so that reducing the former also constrains the latter. We derive a policy gradient theorem with a new objective function that exploits the mean-volatility relationship. Furthermore, we adapt TRPO, the well-known policy gradient algorithm with monotonic improvement guarantees, to a risk-averse setting. Finally, we test the proposed approach in two financial environments using real market data.
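The distinction between reward volatility (variance of per-step rewards under the state-occupancy measure) and the more common return variance can be illustrated with a simple Monte-Carlo estimate. The sketch below is an illustration only, not the paper's estimator; the function names and the discounted-occupancy weighting `(1 - gamma) * gamma**t` are assumptions for a finite-sample setting.

```python
import numpy as np

def reward_volatility(trajectories, gamma=0.99):
    """Monte-Carlo estimate of reward volatility: variance of the
    per-step rewards under the discounted state-occupancy measure.
    `trajectories` is a list of per-step reward sequences."""
    weights, rewards = [], []
    for traj in trajectories:
        for t, r in enumerate(traj):
            weights.append((1.0 - gamma) * gamma ** t)  # occupancy weight of step t
            rewards.append(r)
    w = np.asarray(weights)
    r = np.asarray(rewards)
    w = w / w.sum()                       # normalize weights over all samples
    mean_reward = np.dot(w, r)            # expected per-step reward
    return float(np.dot(w, (r - mean_reward) ** 2))

def return_variance(trajectories, gamma=0.99):
    """Variance of the discounted return, computed across whole trajectories."""
    returns = [sum(gamma ** t * r for t, r in enumerate(traj))
               for traj in trajectories]
    return float(np.var(returns))
```

A policy with constant per-step rewards has zero volatility (and zero return variance), while a policy that alternates large gains and losses can have high volatility even when its returns are stable, which is exactly the intermediate-step risk the abstract refers to.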
The choice of the control frequency of a system has a significant impact on the ability of reinforcement learning algorithms to learn a high-performing policy. In this paper, we introduce the notion of action persistence, which consists in repeating an action for a fixed number of decision steps, with the effect of modifying the control frequency. We first analyze how action persistence affects the performance of the optimal policy, and then present a novel algorithm, Persistent Fitted Q-Iteration (PFQI), which extends FQI with the goal of learning the optimal value function at a given persistence. After providing a theoretical study of PFQI and a heuristic approach to identify the optimal persistence, we present an experimental campaign on benchmark domains that shows the advantages of action persistence and proves the effectiveness of our persistence selection method.
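The persistence mechanism itself can be sketched as an environment wrapper that repeats each chosen action for `k` base decision steps, accumulating the discounted reward along the way. This is only an illustration of the notion of action persistence, not the PFQI algorithm; the class names, the toy `CountingEnv`, and the `(obs, reward, done)` step interface are assumptions made for the example.

```python
class CountingEnv:
    """Toy environment: the state is a step counter, the reward is the action."""
    def __init__(self, horizon=10):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, float(action), self.t >= self.horizon

class ActionPersistenceWrapper:
    """Repeat each chosen action for `persistence` consecutive base steps,
    returning the discounted sum of the intermediate rewards."""
    def __init__(self, env, persistence, gamma=0.99):
        self.env = env
        self.k = persistence
        self.gamma = gamma

    def reset(self):
        return self.env.reset()

    def step(self, action):
        total, discount = 0.0, 1.0
        for _ in range(self.k):
            obs, reward, done = self.env.step(action)  # base-frequency step
            total += discount * reward                  # accumulate discounted reward
            discount *= self.gamma
            if done:                                    # episode may end mid-persistence
                break
        return obs, total, done
```

From the agent's point of view, the wrapped environment runs at a control frequency divided by `k`, which is the quantity the persistence selection heuristic in the paper aims to tune.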