ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020
DOI: 10.1109/icassp40776.2020.9054063

Generalized Linear Bandits with Safety Constraints

Abstract: We study decentralized stochastic linear bandits, where a network of N agents acts cooperatively to efficiently solve a linear bandit-optimization problem over a d-dimensional space. For this problem, we propose DLUCB: a fully decentralized algorithm that minimizes the cumulative regret over the entire network. At each round of the algorithm each agent chooses its actions following an upper confidence bound (UCB) strategy and agents share information with their immediate neighbors through a carefully designed …

Citations: cited by 7 publications (16 citation statements)
References: 16 publications (9 reference statements)
“…The assumption warrants need for a safe starting point which is readily available in most practical problems of interest. Similar assumptions can be found in previous literature on safe linear bandits [53], [54], safe convex and non-convex optimization [55], [56], and safe online convex optimization [15].…”
Section: A Assumptions (mentioning)
confidence: 62%
“…d) IV) Safe Online Optimization:: Safe optimization is a fairly nascent field with only a few works studying pertime safety in optimization problems. In [53], [54] study the problem of safe linear bandits giving O(log(T )…”
Section: B Contributions (mentioning)
confidence: 99%
“…In contrast to our setting, the paper [Usmanova et al, 2019] requires multiple measurements of the constraint at each round of the algorithm. Other closely related works of [Amani et al, 2019, Amani et al, 2020] study the problem of safe linear and generalized linear stochastic bandit where the constraint and loss functions depend linearly (directly or via a link function) on an unknown parameter. In fact, our algorithm can be seen as an extension of Safe-LUCB proposed by [Amani et al, 2019] to safe GPs.…”
Section: Related Work (mentioning)
confidence: 99%
“…Perhaps closest to our work is that of the setting in which there exist two distributions, one over rewards for actions, and one over costs. The goal is to maximize the expected reward, while ensuring that the expected cost of the selected action is below a certain threshold (Amani et al, 2019; Moradipari et al, 2021; Pacchiano et al, 2021). Crucially none of these frameworks allow for observing the constrained only on an uncontrolled subset of the rounds, which is a key challenge of the CBUS setting.…”
Section: Introduction (mentioning)
confidence: 99%