2021
DOI: 10.48550/arxiv.2102.05858
Preprint

Achieving Near Instance-Optimality and Minimax-Optimality in Stochastic and Adversarial Linear Bandits Simultaneously

Abstract: In this work, we develop linear bandit algorithms that automatically adapt to different environments. By plugging a novel loss estimator into the optimization problem that characterizes the instance-optimal strategy, our first algorithm not only achieves nearly instance-optimal regret in stochastic environments, but also works in corrupted environments with additional regret being the amount of corruption, while the state-of-the-art (Li et al., 2019) achieves neither instance-optimality nor the optimal dependence on the amount of corruption.
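For context, the "optimization problem that characterizes the instance-optimal strategy" mentioned in the abstract is, in the standard formulation of Lattimore & Szepesvári (2017) for stochastic linear bandits with a fixed finite action set, the following lower-bound program. This is a sketch with notation assumed from that literature rather than taken from this page: Δ_x is the suboptimality gap of arm x and α(x) its allocation.

```latex
\begin{aligned}
\min_{\alpha:\mathcal{X}\to[0,\infty)} \quad & \sum_{x\in\mathcal{X}} \alpha(x)\,\Delta_x \\
\text{s.t.} \quad & \|x\|^2_{H(\alpha)^{-1}} \le \frac{\Delta_x^2}{2}
    \quad \text{for all } x\in\mathcal{X} \text{ with } \Delta_x>0, \\
& H(\alpha) = \sum_{x\in\mathcal{X}} \alpha(x)\, x x^{\top}.
\end{aligned}
```

An asymptotically instance-optimal algorithm plays each suboptimal arm x roughly α*(x) log T times, where α* solves this program; as the abstract describes, the paper's first algorithm plugs a novel (robust) loss estimator into this program so that the resulting plan remains valid in corrupted environments.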

Cited by 7 publications (21 citation statements)
References 6 publications
“…Compared with Lee et al. (2021), our result has a multiplicative quadratic dependence on C, which seems to be worse. Nevertheless, we want to emphasize that we focus on the linear contextual bandit setting, where the decision sets D_t at each round are not identical, which is more challenging than the stochastic linear bandit setting of Lee et al. (2021), where the decision set is given in advance and stays fixed throughout the execution of the algorithm. Therefore, our result and that in Lee et al. (2021) are not directly comparable.…”
Section: Results (mentioning)
confidence: 72%
“…data, Wei et al (2020) show that a martingale version of Catoni is possible, which is what we apply in this work. We remark that several applications of the Catoni estimator to linear bandits have been proposed recently (Camilleri et al, 2021;Lee et al, 2021). We refer the reader to the survey Lugosi & Mendelson (2019) for a discussion of other robust mean estimators.…”
Section: Related Work (mentioning)
confidence: 94%
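The Catoni estimator discussed in the statement above is the root of an influence-function equation rather than a sample average, which is what makes it robust to heavy-tailed rewards. Below is a minimal Python sketch (my own illustration, not code from any of the cited papers; the variance-based tuning of alpha in the usage example is an assumption following the usual choice):

```python
import numpy as np

def psi(x):
    # Catoni's influence function: psi(x) = sign(x) * log(1 + |x| + x^2 / 2).
    return np.sign(x) * np.log1p(np.abs(x) + 0.5 * x ** 2)

def catoni_mean(samples, alpha, tol=1e-9, max_iter=200):
    """Catoni's robust mean estimate: the root theta of
    sum_i psi(alpha * (X_i - theta)) = 0, found by bisection
    (the map theta -> sum is strictly decreasing)."""
    lo, hi = samples.min(), samples.max()
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        s = psi(alpha * (samples - mid)).sum()
        if abs(s) < tol or hi - lo < tol:
            return mid
        if s > 0:      # estimate too small: move the bracket up
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Usage: 1000 clean samples plus 10 gross outliers.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 1000), np.full(10, 50.0)])
# A common tuning, assuming a known variance bound v = 1 and failure prob 0.01.
alpha = np.sqrt(2 * np.log(2 / 0.01) / (x.size * 1.0))
print(catoni_mean(x, alpha))   # close to 0, unlike x.mean()
```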
“…For MAB, there are a number of studies on best-of-both-worlds algorithms [Bubeck and Slivkins, 2012, Zimmert and Seldin, 2021, Seldin and Slivkins, 2014, Seldin and Lugosi, 2017, Pogodin and Lattimore, 2020, Auer and Chiang, 2016, Wei and Luo, 2018, Zimmert et al, 2019, Lee et al, 2021, Ito, 2021]. Among these, the studies by Wei and Luo [2018], Zimmert and Seldin [2021], and Zimmert et al [2019] are closely related to this work.…”
Section: Related Work (mentioning)
confidence: 99%
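Among the best-of-both-worlds MAB algorithms cited above, Tsallis-INF of Zimmert and Seldin [2021] is the canonical example: follow-the-regularized-leader over the simplex with a 1/2-Tsallis-entropy regularizer and importance-weighted loss estimates. A minimal Python sketch follows (my own illustration, not the authors' code; the learning-rate schedule eta_t = 1/sqrt(t), the constant scaling of the regularizer, and the plain importance-weighted estimator are common choices, and the bandit loop is hypothetical):

```python
import numpy as np

def tsallis_inf_weights(L_hat, eta, iters=60):
    """One FTRL step with a 1/2-Tsallis-entropy regularizer:
    p_i = 1 / (eta * (L_hat_i + c))^2, with the normalizer c found by
    bisection so that p sums to 1 (the sum is decreasing in c)."""
    K = len(L_hat)
    lo = -L_hat.min() + 1e-12            # p blows up as c -> -min(L_hat)
    hi = np.sqrt(K) / eta - L_hat.min()  # here every p_i <= 1/K, so sum <= 1
    for _ in range(iters):
        c = 0.5 * (lo + hi)
        p = 1.0 / (eta * (L_hat + c)) ** 2
        if p.sum() > 1.0:
            lo = c
        else:
            hi = c
    p = 1.0 / (eta * (L_hat + lo)) ** 2
    return p / p.sum()                   # tiny renormalization for safety

# Hypothetical loop: losses[t, i] is the loss of arm i at round t.
rng = np.random.default_rng(0)
T, K = 10_000, 5
losses = rng.uniform(0, 1, (T, K)) * np.linspace(0.2, 1.0, K)  # arm 0 is best
L_hat = np.zeros(K)                      # cumulative loss estimates
for t in range(1, T + 1):
    p = tsallis_inf_weights(L_hat, eta=1.0 / np.sqrt(t))
    a = rng.choice(K, p=p)
    L_hat[a] += losses[t - 1, a] / p[a]  # importance-weighted estimate
```

The same update is played in both stochastic and adversarial environments; the best-of-both-worlds guarantees come from the regularizer and learning-rate schedule, not from detecting which regime the algorithm is in.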