2019
DOI: 10.48550/arxiv.1905.05920
Preprint

Hybrid Stochastic Gradient Descent Algorithms for Stochastic Nonconvex Optimization

Abstract: We introduce a hybrid stochastic estimator to design stochastic gradient algorithms for solving stochastic optimization problems. Such a hybrid estimator is a convex combination of two existing estimators, one biased and one unbiased, and leads to a useful property on its variance. We limit our consideration to a hybrid SARAH-SGD for nonconvex expectation problems. However, our idea can be extended to handle a broader class of estimators in both convex and nonconvex settings. We propose a new single-loop stochastic g…
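Read literally, the abstract describes a search direction built as a convex combination of a biased SARAH-type recursive term and an unbiased SGD term. The following is a minimal sketch of how such a single-loop hybrid update could look; the helper names (`stoch_grad`, `sample`), the fixed step size, and the fixed mixing weight are illustrative assumptions, not the paper's actual algorithm or parameter schedule.

```python
# Minimal sketch of a hybrid SARAH-SGD estimator: a convex combination of a
# biased recursive (SARAH-type) term and an unbiased (SGD-type) term.
# Names and constants below are assumptions for illustration only.
import numpy as np

def hybrid_sarah_sgd(x0, stoch_grad, sample, lr=0.01, beta=0.9, n_iters=1000):
    """Single-loop hybrid SARAH-SGD sketch.

    stoch_grad(x, xi): unbiased stochastic gradient of the objective at x for sample xi.
    sample():          draws one random sample (or mini-batch).
    """
    x_prev = np.asarray(x0, dtype=float)
    v = stoch_grad(x_prev, sample())      # initialize with a plain SGD estimate
    x = x_prev - lr * v
    for _ in range(n_iters):
        xi, zeta = sample(), sample()     # two independent samples per iteration
        sarah = v + stoch_grad(x, zeta) - stoch_grad(x_prev, zeta)  # biased, recursive term
        sgd = stoch_grad(x, xi)                                     # unbiased term
        v = beta * sarah + (1.0 - beta) * sgd                       # convex combination
        x_prev, x = x, x - lr * v
    return x
```

Setting beta = 1 recovers a SARAH-style recursive estimator and beta = 0 recovers plain SGD; the variance property mentioned in the abstract comes from trading off these two extremes.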

Cited by 17 publications (48 citation statements)
References 16 publications
“…However, it needs to know the smoothness parameter and a bound on the gradient norms to set the step size and momentum parameters. Simultaneously to the work of Cutkosky and Orabona [2019], another paper [Tran-Dinh et al., 2019] has obtained the same optimal bound by proposing a similar update rule. Note that Tran-Dinh et al. [2019] does calculate a single anchor point, and it still requires knowledge of the smoothness and variance parameters.…”
Section: Related Work
confidence: 69%
“…In the general case of smooth non-convex objectives it is known that one can approach a stationary point at a rate of O(1/T^{1/4}), where T is the total number of samples [Ghadimi and Lan, 2013]. While this rate is optimal in the general case, it is known that one can obtain an improved rate of O(1/T^{1/3}) if the objective is an expectation over smooth losses [Fang et al., 2018, Zhou et al., 2018, Cutkosky and Orabona, 2019, Tran-Dinh et al., 2019]. Besides, this rate was recently shown to be tight [Arjevani et al., 2019].…”
Section: Introduction
confidence: 99%
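For reference, a stationarity rate after T samples is commonly restated as a sample complexity for reaching an ε-stationary point; under that standard reading (my restatement, not part of the excerpt), the two rates quoted above correspond to:

```latex
% Expected gradient norm after T samples, and the equivalent sample complexity
% for reaching an \varepsilon-stationary point.
\mathbb{E}\,\|\nabla F(x_{\mathrm{out}})\| \le \mathcal{O}\big(T^{-1/4}\big)
  \iff T = \mathcal{O}\big(\varepsilon^{-4}\big),
\qquad
\mathbb{E}\,\|\nabla F(x_{\mathrm{out}})\| \le \mathcal{O}\big(T^{-1/3}\big)
  \iff T = \mathcal{O}\big(\varepsilon^{-3}\big).
```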
“…For the second term on the right-hand side of (27), similarly to the proof of (25), it is easy to find that…”
Section: Step Size Is Fixed
confidence: 74%
“…Its cost and difficulty thus increase. Therefore, a Riemannian stochastic recursive gradient algorithm (R-SRG) that does not depend on two distant points is proposed in [14]; it avoids computing the inverse retraction and is more computationally efficient. In addition, building on [24,25], a Riemannian stochastic recursive momentum (R-SRM) algorithm is proposed in [15]. The authors consider a linear combination of R-SGD and R-SVRG and obtain the R-SRM algorithm (the linear combination coefficient and step size of the algorithm are time-varying).…”
Section: Introduction
confidence: 99%
“…Recently, Cutkosky and Orabona (2019) proposed a recursive momentum-based algorithm called STORM and proved an O(ε^{-3}) gradient complexity to find ε-stationary points. Tran-Dinh et al. (2019) proposed a SARAH-SGD algorithm which hybridizes the SGD and SARAH algorithms, with an O(ε^{-3}) gradient complexity when ε is small. Li et al. (2020) proposed a PAGE algorithm with a probabilistic gradient estimator which also attains an O(ε^{-3}) gradient complexity.…”
Section: Related Work
confidence: 99%
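For context, the recursive momentum estimator of STORM and the probabilistic estimator of PAGE mentioned in this excerpt are usually written as follows; the notation (a_t, p, the batch symbols B and b) is mine and only sketches the standard statements, it is not taken from the cited papers.

```latex
% STORM-style recursive momentum estimator (one fresh sample \xi_t per step):
d_t = \nabla f(x_t;\xi_t) + (1-a_t)\big(d_{t-1} - \nabla f(x_{t-1};\xi_t)\big),
\qquad a_t \in (0,1].

% PAGE-style probabilistic estimator: with probability p take a large-batch
% gradient, otherwise apply a cheap SARAH-type recursive correction.
g_{t+1} =
\begin{cases}
  \nabla f_{B}(x_{t+1}) & \text{with probability } p,\\[2pt]
  g_t + \nabla f_{b}(x_{t+1}) - \nabla f_{b}(x_t) & \text{with probability } 1-p.
\end{cases}
```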