2020
DOI: 10.1613/jair.1.11407

Sliding-Window Thompson Sampling for Non-Stationary Settings

Abstract: Multi-Armed Bandit (MAB) techniques have been successfully applied to many classes of sequential decision problems in the past decades. However, non-stationary settings, which are very common in real-world applications, have received little attention so far, and theoretical guarantees on the regret are known only for some frequentist algorithms. In this paper, we propose an algorithm, namely Sliding-Window Thompson Sampling (SW-TS), for non-stationary stochastic MAB settings. Our algorithm is based on Thompson Sampling an…
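The abstract is truncated here, so the following is only a minimal sketch of the general idea of a sliding-window Thompson Sampling policy, not the paper's exact SW-TS algorithm. It assumes Bernoulli rewards, a uniform Beta(1, 1) prior, and a single global window over the most recent rounds; the class name `SlidingWindowTS` and the parameter `window` are hypothetical names chosen for illustration.

```python
import random
from collections import deque


class SlidingWindowTS:
    """Beta-Bernoulli Thompson Sampling that only uses the last `window` rounds.

    Illustrative sketch: Bernoulli rewards and a Beta(1, 1) prior are assumptions.
    """

    def __init__(self, n_arms, window, seed=None):
        self.n_arms = n_arms
        # (arm, reward) pairs; the deque silently drops the oldest pair
        # once more than `window` rounds have been recorded.
        self.history = deque(maxlen=window)
        self.rng = random.Random(seed)

    def posterior_params(self):
        # Rebuild the Beta parameters from the rounds still inside the window.
        alpha = [1] * self.n_arms
        beta = [1] * self.n_arms
        for arm, reward in self.history:
            alpha[arm] += reward
            beta[arm] += 1 - reward
        return alpha, beta

    def select_arm(self):
        # Draw one sample per arm from its windowed posterior; play the largest.
        alpha, beta = self.posterior_params()
        samples = [self.rng.betavariate(a, b) for a, b in zip(alpha, beta)]
        return max(range(self.n_arms), key=samples.__getitem__)

    def update(self, arm, reward):
        # `reward` is expected to be 0 or 1.
        self.history.append((arm, reward))
```

Because old observations fall out of the window, an arm whose reward distribution has changed is re-estimated from recent data only. A smaller window reacts faster to changes but yields noisier estimates.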

Cited by 35 publications (28 citation statements)
References 18 publications
“…The MAB algorithm we propose, named f-Discounted-Sliding-Window Thompson Sampling (f-dsw TS), mixes two different approaches: a discount approach, inspired by the work of [36, 37] and a sliding window approach, as in [15, 37]. Our solution implicitly assigns more relevance to the recent evidences from the environment, as in the sliding window approach, given that these rewards are not discounted.…”
Section: Methods (mentioning)
confidence: 99%
“…Discounted TS: the TS enhanced with a discount factor, presented in [36]: the parameter controls the amount of discount. Sliding Window TS: the TS with a global sliding window, presented in [15]. The parameter n controls the size of the sliding window.…”
Section: Methods (mentioning)
confidence: 99%
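For contrast with the sliding-window variant, the discounted approach mentioned in the statement above can be sketched as follows. This is one common formulation and not necessarily the one presented in [36]: it assumes Bernoulli rewards and a Beta(1, 1) prior, and the class name `DiscountedTS` and the symbol `gamma` for the discount factor are hypothetical (the quoted statement does not name the parameter).

```python
import random


class DiscountedTS:
    """Beta-Bernoulli Thompson Sampling with exponentially discounted counts.

    Illustrative sketch: `gamma` in (0, 1] is an assumed name for the
    discount factor; gamma = 1 recovers standard Thompson Sampling.
    """

    def __init__(self, n_arms, gamma, seed=None):
        self.gamma = gamma
        self.successes = [0.0] * n_arms
        self.failures = [0.0] * n_arms
        self.rng = random.Random(seed)

    def select_arm(self):
        # Sample from Beta(1 + discounted successes, 1 + discounted failures).
        samples = [
            self.rng.betavariate(1.0 + s, 1.0 + f)
            for s, f in zip(self.successes, self.failures)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, arm, reward):
        # Decay the evidence of every arm, then add the new observation
        # (reward is expected to be 0 or 1) to the arm that was played.
        self.successes = [self.gamma * s for s in self.successes]
        self.failures = [self.gamma * f for f in self.failures]
        self.successes[arm] += reward
        self.failures[arm] += 1 - reward
```

Where a sliding window forgets observations abruptly once they leave the last n rounds, discounting shrinks the weight of every past observation geometrically, so old evidence fades gradually rather than disappearing at a fixed cutoff.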