2017
DOI: 10.1098/rsos.171377

Risk-aware multi-armed bandit problem with application to portfolio selection

Abstract: Sequential portfolio selection has attracted increasing interest in the machine learning and quantitative finance communities in recent years. As a mathematical framework for reinforcement learning policies, the stochastic multi-armed bandit problem addresses the primary difficulty in sequential decision-making under uncertainty, namely the exploration versus exploitation dilemma, and therefore provides a natural connection to portfolio selection. In this paper, we incorporate risk awareness into the classic multi-armed bandit setting…
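
The exploration versus exploitation dilemma the abstract refers to can be made concrete with a minimal ε-greedy simulation: with probability ε the player pulls a random arm (exploration), otherwise the arm with the best current reward estimate (exploitation). This is a generic background sketch under assumed Gaussian arm rewards, not the paper's risk-aware algorithm; the function name and all parameter values are illustrative.

```python
import random

# Minimal epsilon-greedy simulation of a stochastic multi-armed bandit.
# The Gaussian arm parameters and epsilon are illustrative assumptions,
# not values taken from the paper.

def epsilon_greedy(arm_means, arm_stds, horizon=10_000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms          # pulls per arm
    estimates = [0.0] * n_arms     # sample-mean reward estimates
    total_reward = 0.0
    for _ in range(horizon):
        if rng.random() < epsilon:            # explore: uniformly random arm
            arm = rng.randrange(n_arms)
        else:                                 # exploit: best estimate so far
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = rng.gauss(arm_means[arm], arm_stds[arm])
        counts[arm] += 1
        # incremental sample-mean update
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return estimates, counts, total_reward

if __name__ == "__main__":
    est, cnt, total = epsilon_greedy([0.02, 0.05, 0.01], [0.1, 0.1, 0.1])
    print("estimates:", est, "pulls:", cnt)
```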



Cited by 49 publications (27 citation statements); references 49 publications.
“…They are commonly known to have applications in medical trials (Armitage [4] or Anscombe [3]) and experimental design (Berry and Fristedt [13] or the classic paper of Robbins [69]), along with other areas. A few recent works in finance for portfolio selection can also be found in Huo and Fu [45] or Shen et al. [72]. The basic idea is that one has M 'bandits', or equivalently, a bandit with M arms, and one must choose which bandit should be played at each time.…”
Section: Multi-armed Bandits
Mentioning confidence: 99%
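
The quoted passage states the arm-selection step abstractly; one classical policy for choosing which arm to play at each time is UCB1 (Auer, Cesa-Bianchi and Fischer). The sketch below is background illustration under assumed Bernoulli rewards, not the method of the cited paper or of the citing work.

```python
import math
import random

# UCB1: pull each arm once, then at each step pick the arm maximizing
# (sample mean) + sqrt(2 ln t / n_a). Bernoulli arm probabilities are an
# illustrative assumption.

def ucb1(arm_probs, horizon=5_000, seed=0):
    rng = random.Random(seed)
    n = len(arm_probs)
    counts = [0] * n
    means = [0.0] * n
    for t in range(1, horizon + 1):
        if t <= n:
            arm = t - 1                      # initialization: one pull per arm
        else:
            arm = max(
                range(n),
                key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_probs[arm] else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return means, counts
```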
“…Of particular interest is to find the existence of gamblers who are clearly loss-exiting against gain-exiting. Besides, understanding individual gambling behaviour of choosing between and deploying betting systems that have varying risks from the perspective of the multi-armed bandit problem [19] is both interesting and promising.…”
Section: Discussion
Mentioning confidence: 99%
“…Risk-aware methods [8,17,41] are online learning algorithms that model the risk associated with executing certain actions as a cost which is to be constrained and minimized. Galichet et al. [8] introduce the concept of risk-awareness for the multi-armed bandit framework.…”
Section: Safety in Contextual Bandits
Mentioning confidence: 99%
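
As a rough illustration of "risk as a cost to be constrained and minimized", the sketch below scores each arm by a mean-variance penalty and pulls the highest-scoring one. This is a simplified stand-in, not the CVaR-based MaRaB algorithm of Galichet et al.; the `risk_aversion` parameter and the sample history are assumptions for the example.

```python
import statistics

# Risk-aware arm selection via a mean-variance penalty: score each arm by
# (sample mean - risk_aversion * sample variance) and pull the best arm.
# An illustrative simplification of risk-aware bandits, not MaRaB.

def risk_aware_choice(rewards_per_arm, risk_aversion=1.0):
    """rewards_per_arm: one list of observed rewards per arm
    (each arm needs at least one observation)."""
    scores = []
    for obs in rewards_per_arm:
        mean = statistics.fmean(obs)
        var = statistics.pvariance(obs) if len(obs) > 1 else 0.0
        scores.append(mean - risk_aversion * var)
    return max(range(len(scores)), key=lambda a: scores[a])

if __name__ == "__main__":
    # Hypothetical per-arm reward histories (e.g. per-period returns).
    history = [[0.01, 0.03, -0.02], [0.08, -0.10, 0.09], [0.02, 0.02]]
    print("pull arm:", risk_aware_choice(history, risk_aversion=2.0))
```

Raising `risk_aversion` shifts the choice toward arms with steadier payoffs even when their mean reward is lower, which is the trade-off the quoted passage describes.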