2018
DOI: 10.48550/arXiv.1807.07623
Preprint
Tsallis-INF: An Optimal Algorithm for Stochastic and Adversarial Bandits

Abstract: We derive an algorithm that achieves the optimal (up to constants) pseudo-regret in both adversarial and stochastic multi-armed bandits without prior knowledge of the regime and time horizon. The algorithm is based on online mirror descent with Tsallis entropy regularizer. We provide a complete characterization of such algorithms and show that Tsallis entropy with power α = 1/2 achieves the goal. In addition, the proposed algorithm enjoys improved regret guarantees in two intermediate regimes: the moderately c…
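To make the abstract's construction concrete, here is a minimal sketch of online mirror descent with the 1/2-Tsallis entropy regularizer. With power α = 1/2, the sampling distribution admits the parametrization w_i = 4 / (η (L_i − x))², where L is the vector of cumulative importance-weighted losses and x is a normalizer chosen so the weights sum to one. The bisection solver, the exact constant in the anytime learning-rate schedule, and the Bernoulli test environment below are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def tsallis_inf_weights(L, eta):
    """Sampling distribution of OMD with 1/2-Tsallis entropy:
    w_i = 4 / (eta * (L_i - x))^2, with x < min(L) chosen so sum(w) = 1."""
    K = len(L)
    lo = L.min() - 2.0 * np.sqrt(K) / eta   # at lo: every w_i <= 1/K, so sum(w) <= 1
    hi = L.min() - 2.0 / eta                # at hi: the best arm alone has w = 1
    for _ in range(60):                     # bisection on the normalizer x
        x = 0.5 * (lo + hi)
        if (4.0 / (eta * (L - x)) ** 2).sum() > 1.0:
            hi = x
        else:
            lo = x
    w = 4.0 / (eta * (L - x)) ** 2
    return w / w.sum()

# Hypothetical 2-armed stochastic bandit with Bernoulli losses.
K, T = 2, 2000
means = np.array([0.3, 0.5])
L = np.zeros(K)                             # cumulative importance-weighted losses
for t in range(1, T + 1):
    eta = 2.0 / np.sqrt(t)                  # anytime 1/sqrt(t)-type schedule (constant assumed)
    w = tsallis_inf_weights(L, eta)
    arm = rng.choice(K, p=w)
    loss = float(rng.random() < means[arm])
    L[arm] += loss / w[arm]                 # importance-weighted loss estimator
```

The same loop runs unchanged in an adversarial regime (replace the Bernoulli draw with arbitrary losses in [0, 1]); the point of the paper is that no regime detection or horizon knowledge is needed.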

Cited by 6 publications (11 citation statements)
References 4 publications
“…We are now ready to show the main result of this section. Our proof follows the techniques of (Shamir and Zhang, 2013) combined with the analysis of FTRL in (Abernethy et al., 2015; Zimmert and Seldin, 2018). Let u ∈ W be a fixed vector in the convex set to be chosen later.…”
Section: A Last Iterate Convergence of FTRL
confidence: 98%
“…The proof of Theorem 5.1 follows the ideas for the analysis of FTRL in Abernethy et al. (2015) and Zimmert and Seldin (2018), and combines them with the techniques used to obtain last iterate guarantees for stochastic gradient descent in Shamir and Zhang (2013). Since the analysis is somewhat standard, we are going to devote the rest of the section to the privacy analysis.…”
Section: Proof Techniques
confidence: 99%
“…Lykouris et al [LMPL18] introduced a variant of the standard stochastic multi-armed bandit problem, where an adversary can corrupt a number of samples, and provided algorithms with learning rates that degrade according to the number of corruptions. The guarantees for stochastic multi-armed bandits were subsequently strengthened by Gupta et al [GKT19] and Zimmert and Seldin [ZS19], and the concept of adversarial corruptions has also been extended to several other settings including dynamic assortment optimization [CKW19], linear bandits [LLS19] and reinforcement learning [LSSS19]. Our work differs from these in that we use adversarial corruptions as a modeling tool to capture arbitrarily irrational agent behavior in game-theoretic settings.…”
Section: Related Work
confidence: 99%
“…Bridging between the stochastic and adversarial settings in multi-armed bandits, and in online learning more generally, has been a topic of significant interest in recent years. Most research in this direction has focused on obtaining "the best of both worlds" guarantees (Bubeck and Slivkins, 2012; Seldin and Slivkins, 2014; Auer and Chiang, 2016; Seldin and Lugosi, 2017; Zimmert and Seldin, 2018; Zimmert et al., 2019). The goal there is to achieve the better of the bounds at the two extremes: the worst-case O(√T)-type bound on any problem instance, and the better O(log T)-type bound whenever the instance is actually stochastic.…”
Section: Related Work
confidence: 99%
“…Adversarial contaminations similar to those considered here have been studied before in the context of bandit problems. Seldin and Slivkins (2014) and Zimmert and Seldin (2018) consider a "moderately contaminated" regime in which the adversarial corruptions do not reduce the gap min_{i ≠ i⋆} ∆_i by more than a constant factor at any point in time. This regime of contamination is very restrictive and, for example, precludes virtually any form of corruption in the early stages of learning.…”
Section: Related Work
confidence: 99%