Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing 2021
DOI: 10.1145/3406325.3451004

Linear bandits with limited adaptivity and learning distributional optimal design

Cited by 9 publications (14 citation statements)
References 15 publications

“…Bandits with limited adaptivity complexity. There is a lot of interest in obtaining bandit algorithms that update their policies rarely (Abbasi-Yadkori et al 2011, Perchet et al 2016, Gao et al 2019, Dong et al 2020, Chen et al 2020, Ruan et al 2021). Notably, Dong et al (2020) study rare policy switching constraints for a broader class of online learning and decision making problems such as logit bandits.…”
Section: Related Work (mentioning)
confidence: 99%
“…We prove that while the communication cost of DisBE-LUCB is only Õ(dN), it achieves a regret Õ(√(dNT)), which is of the same order as that incurred by a near optimal single-agent algorithm for NT rounds. We highlight that while DisBE-LUCB is inspired by the single-agent BatchLinUCB-DG proposed in [Ruan et al, 2021] in an attempt to save on communication as much as possible, a direct use of confidence set in [Ruan et al, 2021] would fail to guarantee optimal communication cost Õ(dN) and require more communication for each agent. We address this issue by introducing a new confidence set in DisBE-LUCB and Lemma 1 as our first main technical contribution.…”
Section: Contributions (mentioning)
confidence: 99%
“…An important line of work related to communication efficiency in distributed bandits studies practical single agent scenarios using batch elimination methods, in which a very small number of batches may achieve minimax optimal learning performance, and therefore it is possible to enjoy the benefits of both adaptivity and parallelism [Ruan et al, 2021, Han et al, 2020, Gao et al, 2019]. Our proposed algorithms are inspired by the single-agent BatchLinUCB-DG proposed in [Ruan et al, 2021] in an attempt to save on communication as much as possible. That said, a direct use and analysis of confidence set in [Ruan et al, 2021] would fail to guarantee optimal communication cost Õ(dN) and require more communication for each agent.…”
Section: Related Work (mentioning)
confidence: 99%