2022
DOI: 10.48550/arxiv.2203.01400
Preprint

Adaptive Gradient Methods with Local Guarantees

Abstract: Adaptive gradient methods are the method of choice for optimization in machine learning and are used to train the largest deep models. In this paper we study the problem of learning a local preconditioner that can change as the data changes along the optimization trajectory. We propose an adaptive gradient method that has provable adaptive regret guarantees vs. the best local preconditioner. To derive this guarantee, we prove a new adaptive regret bound in online learning that improves upon previous adaptive …
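For context on the kind of preconditioning the abstract refers to, below is a minimal sketch of a classical diagonal AdaGrad-style update, in which the preconditioner is built from accumulated squared gradients. This is an illustration only, not the paper's local adaptive-regret algorithm; the function name, learning rate, and toy objective are assumptions.

```python
import numpy as np

def adagrad_step(x, grad, accum, lr=0.1, eps=1e-8):
    """One diagonal AdaGrad-style update (illustrative sketch).

    The preconditioner is the inverse square root of the accumulated
    squared gradients; lr and eps are assumed hyperparameters.
    """
    accum = accum + grad ** 2                    # running second-moment accumulator
    x = x - lr * grad / (np.sqrt(accum) + eps)   # preconditioned gradient step
    return x, accum

# Usage sketch on a toy quadratic f(x) = 0.5 * x^T A x
A = np.diag([10.0, 1.0])
x = np.array([1.0, 1.0])
accum = np.zeros_like(x)
for _ in range(100):
    grad = A @ x
    x, accum = adagrad_step(x, grad, accum)
print(x)  # approaches the minimizer at the origin
```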


Cited by 2 publications (2 citation statements)
References 13 publications
“…We employ switching regret for non-convex optimization. More refined analysis may be possible via generalizations such as strongly adaptive or dynamic regret (Daniely et al., 2015; Jun et al., 2017; Zhang et al., 2018; Jacobsen & Cutkosky, 2022; Cutkosky, 2020; Lu et al., 2022; Luo et al., 2022; Zhang et al., 2021; Baby & Wang, 2022; Zhang et al., 2022). Moreover, our analysis assumes perfect tuning of constants (e.g., D, T, K) for simplicity.…”
Section: Discussion
confidence: 99%
“…This bound was further improved to $O(\sqrt{|I| \log T})$ by [9] using a coin-betting technique. Recently, [2] achieved a more refined second-order bound $\tilde{O}\big(\sqrt{\sum_{t \in I} \|\nabla_t\|^2}\big)$, and [10] further improved it to $\tilde{O}\big(\sqrt{\min_{H \succeq 0,\ \mathrm{Tr}(H) \le d} \sum_{t \in I} \nabla_t^\top H^{-1} \nabla_t}\big)$, which matches the regret of AdaGrad [4]. However, these algorithms are all based on the initial exponential-lookback technique of [7], and require $\Theta(\log T)$ experts per round, increasing the computational complexity of the base algorithm in their reduction by this factor.…”
Section: Related Work
confidence: 99%
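The excerpt above refers to the exponential-lookback (geometric covering) construction, in which every round belongs to one interval per scale, so a strongly adaptive reduction must run $\Theta(\log T)$ base experts simultaneously. The following is a minimal illustrative sketch of how the active intervals at a given round can be enumerated; it is not code from any of the cited papers, and the function name and interval alignment are assumptions.

```python
def active_geometric_intervals(t: int):
    """Return the geometric covering intervals (scale, start, end) containing round t.

    Illustrative sketch: for each scale k with 2**k <= t, exactly one interval
    of length 2**k (aligned to multiples of 2**k) covers t, so the number of
    active intervals -- and hence of base experts the reduction maintains --
    grows as Theta(log t).
    """
    intervals = []
    k = 0
    while 2 ** k <= t:
        start = (t // 2 ** k) * 2 ** k   # left endpoint aligned to a multiple of 2**k
        end = start + 2 ** k - 1         # interval of length 2**k
        intervals.append((k, start, end))
        k += 1
    return intervals

if __name__ == "__main__":
    for t in (1, 7, 1024):
        acts = active_geometric_intervals(t)
        print(f"t={t}: {len(acts)} active experts", acts)
```

Running the sketch shows, e.g., 3 active intervals at t=7 and 11 at t=1024, matching the logarithmic per-round overhead the citation statement describes.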