2016
DOI: 10.48550/arxiv.1611.00756
Preprint

Accelerated Methods for Non-Convex Optimization

Abstract: We present an accelerated gradient method for non-convex optimization problems with Lipschitz continuous first and second derivatives. The method requires time O(ε^(−7/4) log(1/ε)) to find an ε-stationary point, meaning a point x such that ‖∇f(x)‖ ≤ ε. The method improves upon the O(ε^(−2)) complexity of gradient descent and provides the additional second-order guarantee that ∇²f(x) ⪰ −O(ε^(1/2))I for the computed x. Furthermore, our method is Hessian-free, i.e. it only requires gradient computations, and is therefore …
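As a concrete illustration of the ε-stationarity criterion ‖∇f(x)‖ ≤ ε described in the abstract, the sketch below runs a standard Nesterov-style accelerated gradient loop until that test is met. It is a minimal sketch, not the paper's method (which is likewise Hessian-free but additionally exploits negative curvature to reach the O(ε^(−7/4) log(1/ε)) rate); the test function, the step size 1/L, and the momentum schedule are illustrative assumptions.

import numpy as np

def accelerated_gd(grad_f, x0, L=10.0, eps=1e-4, max_iters=100_000):
    """Run Nesterov-style accelerated gradient descent until ||grad f(x)|| <= eps.

    grad_f : callable returning the gradient at a point (numpy array)
    x0     : starting point
    L      : assumed Lipschitz constant of the gradient (hypothetical value)
    eps    : stationarity tolerance from the abstract's stopping criterion
    """
    x = x0.copy()
    y = x0.copy()   # extrapolated ("momentum") iterate
    t = 1.0         # Nesterov momentum parameter
    for _ in range(max_iters):
        g = grad_f(y)
        if np.linalg.norm(g) <= eps:   # eps-stationary point reached
            return y
        x_next = y - g / L             # gradient step from the extrapolated point
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)   # momentum extrapolation
        x, t = x_next, t_next
    return y

# Example on a simple non-convex function f(x) = ||x||^2 + cos(sum(x)):
grad = lambda x: 2.0 * x - np.sin(np.sum(x))
x_hat = accelerated_gd(grad, np.ones(5), L=10.0, eps=1e-6)

On non-convex problems, plain momentum of this kind carries no ε^(−7/4) guarantee; the sketch only shows how the stopping rule is checked in practice.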

Cited by 34 publications (80 citation statements)
References 30 publications
“…[fragment of a comparison table: Reference | Oracle | Iterations | Simplicity; non-stochastic methods [1,6] use a Hessian-vector product oracle with Õ(log n/… iterations] It is worth highlighting that our gradient-descent based algorithm enjoys the following nice features:…”
Section: Setting (mentioning)
confidence: 99%
“…On the contrary, algorithms with nested loops often suffer from significant overheads in large scales, or introduce concerns with the setting of hyperparameters and numerical stability (see e.g. [1,6]), making them relatively hard to find practical implementations.…”
Section: Introduction (mentioning)
confidence: 99%
“…Momentum acceleration methods are used regularly in the convex setting, as well as in machine learning practical scenarios [50,87,50,11,68,15,32]. While momentum acceleration was previously studied in nonconvex programming setups, it mostly involve non-convex constraints with a convex objective function [52,53,49,97]; and generic non-convex settings but only considering with the question of whether momentum acceleration leads to fast convergence to a saddle point or to a local minimum, rather than to a global optimum [31,56,18,4].…”
Section: Related Work (mentioning)
confidence: 99%
“…By utilizing second-order information, one can obtain improved rate of convergence to approximate local minima. This includes approaches based on Nesterov and Polyak's cubic regularization [1,18,27], or first-order method with accelerated gradient method as a sub-solver for escaping saddle points [2].…”
Section: Introduction (mentioning)
confidence: 99%