2017
DOI: 10.48550/arxiv.1704.00116
Preprint

Stochastic L-BFGS: Improved Convergence Rates and Practical Acceleration Strategies

Renbo Zhao, William B. Haskell, Vincent Y. F. Tan

Abstract: We revisit the stochastic limited-memory BFGS (L-BFGS) algorithm. By proposing a new coordinate transformation framework for the convergence analysis, we prove improved convergence rates and computational complexities of the stochastic L-BFGS algorithms compared to previous works. In addition, we propose several practical acceleration strategies to speed up the empirical performance of such algorithms. We also provide theoretical analyses for most of the strategies. Experiments on large-scale logistic and ridg…
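For orientation, here is a minimal, illustrative Python sketch of a stochastic L-BFGS iteration of the kind the abstract describes: mini-batch gradients supply both the curvature pairs and the input to the two-loop recursion that produces the search direction. This is a toy sketch under simplifying assumptions (fixed step size, curvature pairs formed from gradients of two different mini-batches), not the authors' algorithm or their acceleration strategies.

```python
import numpy as np
from collections import deque

def two_loop_direction(grad, s_list, y_list):
    """L-BFGS two-loop recursion: approximate -H*grad from stored (s, y) pairs."""
    q = grad.copy()
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: newest pair to oldest pair.
    for s, y, rho in zip(reversed(s_list), reversed(y_list), reversed(rhos)):
        a = rho * np.dot(s, q)
        alphas.append(a)
        q -= a * y
    # Initial Hessian scaling from the most recent curvature pair.
    gamma = (np.dot(s_list[-1], y_list[-1]) / np.dot(y_list[-1], y_list[-1])
             if s_list else 1.0)
    r = gamma * q
    # Second loop: oldest pair to newest pair.
    for s, y, rho, a in zip(s_list, y_list, rhos, reversed(alphas)):
        b = rho * np.dot(y, r)
        r += (a - b) * s
    return -r  # descent direction

def stochastic_lbfgs(grad_fn, w0, n, memory=10, batch=32, lr=0.05, iters=200, seed=0):
    """Toy stochastic L-BFGS loop: mini-batch gradients plus limited-memory pairs.
    grad_fn(w, idx) must return the mini-batch gradient at w over sample indices idx."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    s_hist, y_hist = deque(maxlen=memory), deque(maxlen=memory)
    prev_w, prev_g = None, None
    for _ in range(iters):
        idx = rng.choice(n, size=batch, replace=False)
        g = grad_fn(w, idx)
        if prev_w is not None:
            # Simplification: s and y come from gradients of two *different* batches;
            # published stochastic L-BFGS variants use more careful curvature estimates.
            s, y = w - prev_w, g - prev_g
            if np.dot(s, y) > 1e-10:  # keep only pairs with positive curvature
                s_hist.append(s); y_hist.append(y)
        d = two_loop_direction(g, list(s_hist), list(y_hist))
        prev_w, prev_g = w.copy(), g
        w = w + lr * d
    return w

# Hypothetical usage on a regularized least-squares problem (illustrative only):
# A, b = np.random.randn(1000, 20), np.random.randn(1000)
# grad = lambda w, idx: A[idx].T @ (A[idx] @ w - b[idx]) / len(idx) + 0.1 * w
# w_star = stochastic_lbfgs(grad, np.zeros(20), n=1000)
```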


Cited by 1 publication (2 citation statements), published 2021
References 19 publications
“…Reference   Parallel                            Convergence
  [22]                                            sublinear
  [25]                                            linear
  [23]                                            linear
  [26]                                            linear
  [29]        parallel two-loop recursion         −
  [30]        map-reduce for gradient             sublinear
  [31]        map-reduce for gradient             linear
  [32, 33]    parallel calculation for gradient   −
  [34]        parallel calculation for Hessian    superlinear
  AsySQN      parallel model for L-BFGS           linear
…sion can be calculated fast. As a successful trial to create both stochastic and parallel algorithms, multi-batch L-BFGS [30] uses map-reduce to compute both the gradients and the updating rules for L-BFGS.…”
Section: QN Methods Stochastic (mentioning)
confidence: 99%
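The last sentence of this statement describes multi-batch L-BFGS [30] computing gradients with map-reduce. The sketch below shows only that generic pattern for a sum-structured least-squares gradient, with a local process pool standing in for a cluster; the data, shapes, and function names are illustrative assumptions, not the method of [30].

```python
import numpy as np
from multiprocessing import Pool

def partial_gradient(args):
    """Map step: un-normalized least-squares gradient on one data shard."""
    A_shard, b_shard, w = args
    return A_shard.T @ (A_shard @ w - b_shard)

def mapreduce_gradient(A, b, w, n_workers=4):
    """Shard the samples, map partial gradients, reduce by summing."""
    shard_idx = np.array_split(np.arange(b.shape[0]), n_workers)
    tasks = [(A[s], b[s], w) for s in shard_idx]
    with Pool(n_workers) as pool:      # stands in for the cluster's map phase
        parts = pool.map(partial_gradient, tasks)
    return sum(parts) / b.shape[0]     # reduce phase

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, b, w = rng.normal(size=(1000, 20)), rng.normal(size=1000), np.zeros(20)
    g = mapreduce_gradient(A, b, w)
    # Sanity check against the single-machine gradient.
    assert np.allclose(g, A.T @ (A @ w - b) / 1000)
```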
“…Using the variance reduction (VR) technique proposed in [24], the convergence rate can be improved to linear in the latest attempts [25,23]. Later, acceleration strategies [26] combine VR with non-uniform mini-batch subsampling and momentum computation to derive a fast and practical stochastic algorithm. Another line of stochastic quasi-Newton work focuses on self-concordant objective functions; it places stronger requirements on the shape and properties of the objective but can reach a linear convergence rate.…”
Section: Introduction (mentioning)
confidence: 99%
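The variance-reduction technique referenced as [24] is described as lifting the rate to linear; estimators of this kind correct each stochastic gradient with a periodically refreshed full-gradient snapshot. Below is a minimal SVRG-style sketch in Python; treating [24] as this specific estimator is an assumption, and all names and defaults are illustrative.

```python
import numpy as np

def svrg(grad_i, full_grad, w0, n, epochs=10, inner=200, lr=0.1, seed=0):
    """Plain variance-reduced SGD loop. Stochastic quasi-Newton variants feed the
    same estimator into their L-BFGS direction instead of a raw mini-batch gradient.
    grad_i(w, i): gradient of the i-th sample; full_grad(w): gradient over all n samples."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    for _ in range(epochs):
        w_snap = w.copy()
        mu = full_grad(w_snap)        # full-gradient snapshot, recomputed once per epoch
        for _ in range(inner):
            i = int(rng.integers(n))
            # Variance-reduced gradient: unbiased for the full gradient at w,
            # with variance shrinking as w approaches the snapshot (and the optimum).
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w = w - lr * g
    return w
```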