2016
DOI: 10.1137/140954362

A Stochastic Quasi-Newton Method for Large-Scale Optimization

Abstract: The question of how to incorporate curvature information in stochastic approximation methods is challenging. The direct application of classical quasi-Newton updating techniques for deterministic optimization leads to noisy curvature estimates that have harmful effects on the robustness of the iteration. In this paper, we propose a stochastic quasi-Newton method that is efficient, robust and scalable. It employs the classical BFGS update formula in its limited memory form, and is based on the observation that …
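
The method the abstract describes, a limited-memory BFGS update fed with curvature information gathered at spaced intervals through subsampled Hessian-vector products (as the citing papers below also summarize), can be illustrated with a small sketch. The following is a minimal, hedged illustration on a toy least-squares problem, not the authors' implementation: all names (`stoch_grad`, `subsampled_hess_vec`, the window length `L`, the memory size `M`) and parameter values are chosen for exposition only.

```python
import numpy as np

# Toy least-squares problem: F(x) = ||Ax - b||^2 / (2n).
rng = np.random.default_rng(0)
n, d = 1000, 20
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.1 * rng.normal(size=n)

def stoch_grad(x, idx):
    """Mini-batch gradient of F at x over the rows indexed by idx."""
    Ai = A[idx]
    return Ai.T @ (Ai @ x - b[idx]) / len(idx)

def subsampled_hess_vec(v, idx):
    """Subsampled Hessian-vector product; here the Hessian on S is A_S^T A_S / |S|."""
    Ai = A[idx]
    return Ai.T @ (Ai @ v) / len(idx)

def two_loop(g, mem):
    """Standard L-BFGS two-loop recursion applied to the gradient estimate g."""
    q, alphas = g.copy(), []
    for s, y, rho in reversed(mem):
        a = rho * s.dot(q)
        alphas.append(a)
        q -= a * y
    if mem:
        s, y, _ = mem[-1]
        q *= s.dot(y) / y.dot(y)               # usual initial Hessian scaling
    for (s, y, rho), a in zip(mem, reversed(alphas)):
        q += (a - rho * y.dot(q)) * s
    return q

x, x_bar, x_bar_prev = np.zeros(d), np.zeros(d), None
mem, M, L, g_batch, h_batch = [], 10, 20, 32, 256
for t in range(1, 2001):
    g = stoch_grad(x, rng.choice(n, g_batch, replace=False))
    x -= 0.1 / np.sqrt(t) * two_loop(g, mem)   # quasi-Newton step, diminishing step size
    x_bar += x / L                             # running average of iterates in the window
    if t % L == 0:                             # collect one curvature pair every L steps
        if x_bar_prev is not None:
            s = x_bar - x_bar_prev
            y = subsampled_hess_vec(s, rng.choice(n, h_batch, replace=False))
            if s.dot(y) > 1e-10:               # keep the pair only if curvature is positive
                mem = (mem + [(s, y, 1.0 / s.dot(y))])[-M:]   # limited memory of M pairs
        x_bar_prev, x_bar = x_bar, np.zeros(d)

print("distance to solution:", np.linalg.norm(x - x_true))
```

The design point in this sketch is that the pair (s, y) is formed from a Hessian-vector product on a dedicated subsample of averaged iterates rather than from differences of noisy mini-batch gradients, which is the kind of curvature estimate the citing papers below attribute to this work.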

Cited by 328 publications (324 citation statements)
References 16 publications

“…The quasi-Newton method proposed in [5] uses some subsampled Hessian algorithms via the sample average approximation (SAA) approach to estimate Hessian-vector multiplications. In [6], the authors proposed to use the SA approach instead of SAA to estimate the curvature information. This stochastic quasi-Newton method is based on L-BFGS [26] and performs very well in some problems arising from machine learning, but no theoretical convergence analysis was provided in [6].…”
mentioning
confidence: 99%
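
The distinction this quote draws between the SAA and SA approaches to estimating curvature can be made concrete with a small sketch. The toy least-squares data, the helper `hess_vec`, and the subsample sizes below are illustrative assumptions, not code from [5] or [6]: SAA fixes one subsample up front and reuses it, whereas SA redraws the subsample each time the Hessian-vector product is needed.

```python
import numpy as np

# Toy least-squares model; the Hessian restricted to a subsample S is A_S^T A_S / |S|.
rng = np.random.default_rng(1)
n, d = 500, 10
A = rng.normal(size=(n, d))

def hess_vec(v, idx):
    """Hessian-vector product computed on the subsample idx only."""
    Ai = A[idx]
    return Ai.T @ (Ai @ v) / len(idx)

v = rng.normal(size=d)

# SAA: one subsample is drawn up front and reused for every subsequent product,
# so the algorithm works with a single, fixed approximation of the Hessian.
saa_idx = rng.choice(n, 100, replace=False)
hv_saa = hess_vec(v, saa_idx)

# SA: a fresh subsample is drawn each time the product is needed, giving an
# unbiased but noisy estimate that changes from call to call.
hv_sa = hess_vec(v, rng.choice(n, 100, replace=False))

print(np.linalg.norm(hv_saa - hv_sa))   # the two estimates generally differ
```
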
“…In [6], the authors proposed to use the SA approach instead of SAA to estimate the curvature information. This stochastic quasi-Newton method is based on L-BFGS [26] and performs very well in some problems arising from machine learning, but no theoretical convergence analysis was provided in [6]. Stochastic quasi-Newton methods based on BFGS and L-BFGS updates were also studied for online convex optimization in Schraudolph et al [42], with no convergence analysis provided, either.…”
mentioning
confidence: 99%
“…This and several other practical issues have been recently addressed in [2]. Finally, another class of extensions to SGD is stochastic quasi-Newton methods [6,11]. Despite their clear potential, a lack of theoretical understanding and complicated implementation issues compared to those above may still limit their adoption in the wider community.…”
mentioning
confidence: 99%
“…An interested reader should consult [32,34] for initial guidance on stochastic optimization problems. Several natural extensions are easily incorporated into the framework considered in this paper, for example search directions with second-order information, which usually yield faster convergence but also incur additional cost [8,9,10]. In order to decrease the linear algebra costs, one can consider preconditioners, although their construction might be a nontrivial issue due to the presence of the random variable.”
Section: Discussion
mentioning
confidence: 99%
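
The remark about preconditioners in the last quote can be illustrated with one of the cheapest options: a diagonal preconditioner accumulated from past stochastic gradients (AdaGrad-style scaling). This is a generic sketch under that assumption, not the construction discussed in [8,9,10]; the toy objective, step size, and iteration count are made up for the example.

```python
import numpy as np

def precond_sgd(stoch_grad, x0, steps=2000, lr=0.5, eps=1e-8, seed=0):
    """SGD with a diagonal preconditioner accumulated from past stochastic gradients."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    accum = np.zeros_like(x)                      # running sum of squared gradient entries
    for _ in range(steps):
        g = stoch_grad(x, rng)
        accum += g * g
        x -= lr * g / (np.sqrt(accum) + eps)      # cheap diagonal scaling of the step
    return x

# Toy usage: minimize E[(a^T x - b)^2] / 2 where b = a^T x_star for random a.
d = 5
x_star = np.arange(1.0, d + 1)

def sg(x, rng):
    a = rng.normal(size=d)
    return a * (a @ x - a @ x_star)               # one-sample stochastic gradient

print(precond_sgd(sg, np.zeros(d)))               # should approach x_star
```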