2012
DOI: 10.48550/arxiv.1205.4217
Preprint
Thompson Sampling: An Asymptotically Optimal Finite Time Analysis

Cited by 1 publication (1 citation statement) | References 0 publications
“…Lai and Robbins (1985) proved a lower bound on the regret of any instance-dependent bandit algorithm for the vanilla MAB. Kaufmann, Korda, and Munos (2012) and Agrawal and Goyal (2012) analysed the Thompson sampling algorithm for the K-armed MAB with Bernoulli and Gaussian reward distributions respectively, and proved asymptotic optimality in the Bernoulli setting relative to the lower bound of Lai and Robbins (1985). Granmo (2008) proposed the Bayesian learning automaton, which is self-correcting and converges to pulling only the optimal arm with probability 1.…”
Section: Introduction
Confidence: 99%
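The Thompson sampling algorithm referenced in the statement above can be sketched for the Bernoulli K-armed bandit: each arm keeps a Beta posterior over its unknown success probability, and at every round the learner samples once from each posterior and pulls the arm with the largest sample. This is a minimal illustrative sketch, not the authors' implementation; the function name, arm means, and horizon are all assumptions chosen for the example.

```python
import random

def thompson_sampling_bernoulli(true_means, horizon, seed=0):
    # Illustrative sketch of Thompson sampling for the Bernoulli K-armed
    # bandit (the setting analysed by Kaufmann, Korda, and Munos 2012).
    # Each arm i keeps a Beta(successes_i + 1, failures_i + 1) posterior,
    # starting from the uniform Beta(1, 1) prior.
    rng = random.Random(seed)
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(horizon):
        # Sample one plausible mean per arm from its current posterior,
        # then play the arm whose sample is largest.
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                   for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        # Observe a Bernoulli reward and update that arm's posterior.
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

# Hypothetical 3-armed instance: over a long horizon, the best arm
# (mean 0.8) should receive the vast majority of pulls.
pulls = thompson_sampling_bernoulli([0.2, 0.5, 0.8], horizon=2000)
```

The asymptotic-optimality result cited above says that, in this Bernoulli setting, the expected number of pulls of each suboptimal arm matches the Lai and Robbins (1985) lower bound up to lower-order terms.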