“…One idea is to judiciously evaluate a so-termed snapshot gradient $\nabla f(x^s)$, and use it as an anchor for the stochastic draws in subsequent iterations. Members of the variance reduction family include schemes abbreviated as SDCA [Shalev-Shwartz and Zhang, 2013], SVRG [Johnson and Zhang, 2013], SAG [Roux et al., 2012], SAGA [Defazio et al., 2014], MISO [Mairal, 2013], SARAH [Nguyen et al., 2017], and their variants [Konečný and Richtárik, 2013, Lei et al., 2017, Li et al., 2019, Kovalev et al., 2019]. Most of these algorithms rely on the update $x_{k+1} = x_k - \eta v_k$, where $\eta$ is a constant step size and $v_k$ is an algorithm-specific gradient estimate that takes advantage of the snapshot gradient.…”
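As one concrete instance of such a $v_k$, the SVRG estimator [Johnson and Zhang, 2013] combines a fresh stochastic gradient with the snapshot gradient as $v_k = \nabla f_{i_k}(x_k) - \nabla f_{i_k}(x^s) + \nabla f(x^s)$, which stays unbiased while its variance shrinks as $x_k$ approaches $x^s$. Below is a minimal sketch of this anchored update on a toy least-squares problem; the problem data, step size, and loop lengths are illustrative assumptions, not taken from the excerpt.

```python
# Sketch of an SVRG-style anchored update on a toy least-squares objective
# f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2. Hyperparameters are
# assumptions chosen for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 10
A = rng.normal(size=(n, d))
b = rng.normal(size=n)

def grad_i(x, i):
    # gradient of the i-th component f_i(x) = 0.5 * (a_i^T x - b_i)^2
    return A[i] * (A[i] @ x - b[i])

def full_grad(x):
    # full gradient (1/n) * sum_i grad f_i(x)
    return A.T @ (A @ x - b) / n

eta = 0.01                  # constant step size
x_snap = np.zeros(d)
for s in range(20):                # outer loop: refresh the snapshot x^s
    g_snap = full_grad(x_snap)     # snapshot gradient  ∇f(x^s)
    x = x_snap.copy()
    for k in range(n):             # inner loop: anchored stochastic steps
        i = rng.integers(n)
        # SVRG estimate v_k: unbiased, variance vanishes as x -> x^s
        v = grad_i(x, i) - grad_i(x_snap, i) + g_snap
        x = x - eta * v            # the common update x_{k+1} = x_k - eta * v_k
    x_snap = x

print("final gradient norm:", np.linalg.norm(full_grad(x_snap)))
```

Other members of the family differ mainly in how $v_k$ is formed: SAG and SAGA maintain a table of the most recent per-component gradients instead of a periodic snapshot, while SARAH updates its estimate recursively.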