2014
DOI: 10.48550/arxiv.1407.1537
Preprint

Linear Coupling: An Ultimate Unification of Gradient and Mirror Descent

Abstract: First-order methods play a central role in large-scale machine learning. Even though many variations exist, each suited to a particular problem, almost all such methods fundamentally rely on two types of algorithmic steps: gradient descent, which yields primal progress, and mirror descent, which yields dual progress. We observe that the performances of gradient and mirror descent are complementary, so that faster algorithms can be designed by linearly coupling the two. We show how to reconstruct Nesterov's accelerated gradient methods…
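
To make the coupling idea concrete, below is a minimal sketch in Python of one way a linearly coupled gradient/mirror iteration can look for an L-smooth convex objective. It assumes the Euclidean mirror map (so the mirror step reduces to a plain step on a separate sequence) and uses an illustrative step-size schedule; the names `grad_f`, `alpha`, and `tau` are placeholders, not notation from the paper.

```python
import numpy as np

def linear_coupling(grad_f, x0, L, n_iters=100):
    """Sketch of a linearly coupled gradient/mirror descent loop.

    Assumes f is convex and L-smooth and uses the Euclidean mirror map,
    so the mirror step is just a step on the dual sequence z.
    The step-size schedule below is illustrative, not the paper's exact one.
    """
    y = x0.copy()   # gradient-descent sequence (primal progress)
    z = x0.copy()   # mirror-descent sequence (dual progress)
    for k in range(1, n_iters + 1):
        alpha = (k + 1) / (2.0 * L)      # mirror step size, growing with k
        tau = 1.0 / (alpha * L)          # coupling weight in (0, 1]
        x = tau * z + (1.0 - tau) * y    # linearly couple the two sequences
        g = grad_f(x)
        y = x - g / L                    # gradient step: primal progress
        z = z - alpha * g                # mirror step (Euclidean case): dual progress
    return y

# Example: minimize f(x) = 0.5 * ||A x - b||^2, which is L-smooth with L = ||A||_2^2
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad_f = lambda x: A.T @ (A @ x - b)
L = np.linalg.norm(A, 2) ** 2
x_star = linear_coupling(grad_f, np.zeros(2), L, n_iters=200)
```

The intuition behind the coupling, as the abstract indicates, is that the two guarantees are strong in opposite regimes: the gradient step makes large primal progress when the gradient is large, while the mirror step loses little when the gradient is small.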

Cited by 45 publications (90 citation statements). References 18 publications.
“…Also both SCSG and SARAH are non-accelerated methods and thus also do not achieve the optimal convergence results. Therefore, much recent research effort has been devoted to the design of accelerated gradient methods (e.g., Nesterov, 2004; Lan, 2012; Allen-Zhu and Orecchia, 2014; Su et al, 2014; Lin et al, 2015; Allen-Zhu, 2017; Lan and Zhou, 2018; Lan et al, 2019; Li et al, 2020b). As can be seen from Table 1, for strongly convex finite-sum problems, existing accelerated methods such as RPDG (Lan and Zhou, 2015), Katyusha (Allen-Zhu, 2017) and Varag (Lan et al, 2019) are optimal, since their convergence results are $O\big((n + \sqrt{nL/\mu})\log\frac{1}{\epsilon}\big)$, matching the lower bound $\Omega\big((n + \sqrt{nL/\mu})\log\frac{1}{\epsilon}\big)$ given by Lan and Zhou (2015).…”
Section: Introduction (citation type: mentioning)
Confidence: 99%
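
For context on the rates quoted above, a rough side-by-side of the commonly cited complexities for L-smooth, µ-strongly convex finite-sum problems with n components and target accuracy ε (constants and assumptions vary by paper; this is the textbook form, not a quotation from the cited works):

$$\text{non-accelerated (SVRG/SAGA-type):}\ \ O\!\Big(\big(n + \tfrac{L}{\mu}\big)\log\tfrac{1}{\epsilon}\Big), \qquad \text{accelerated (RPDG, Katyusha, Varag):}\ \ O\!\Big(\big(n + \sqrt{\tfrac{nL}{\mu}}\big)\log\tfrac{1}{\epsilon}\Big).$$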
“…Similar accelerated multi-step methods have also been investigated for solving non-smooth problems of the form (1), e.g., [6, 18, 38, 49]. The great theoretical properties as well as the empirical performance of such accelerated methods have prompted many authors to try to understand the underlying mechanism and the natural scope of the acceleration concept, e.g., physical momentum, relations to other first-order algorithms, as well as geometrical and continuous-time dynamics points of view [1, 10, 19, 27, 30, 46, 50]. Most relevant to the present paper is the result of [1], in which an acceleration scheme was designed by an appropriate linear coupling of the gradient and mirror descent steps to draw upon their complementary characteristics.…”
Section: Example (citation type: mentioning)
Confidence: 99%
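
The "complementary characteristics" referenced in that statement can be summarized by two standard per-step guarantees, stated here for the Euclidean case with an L-smooth convex f (the constants follow the usual textbook form rather than any one paper's notation): the gradient step makes primal progress proportional to the squared gradient norm, while the mirror step's per-step regret loss is also controlled by it, so one bound is strong exactly when the other is weak.

$$\text{gradient step } y = x - \tfrac{1}{L}\nabla f(x):\quad f(x) - f(y) \ \ge\ \tfrac{1}{2L}\,\|\nabla f(x)\|^2,$$
$$\text{mirror step } z^{+} = z - \alpha\,\nabla f(x):\quad \alpha\,\langle\nabla f(x),\, z - u\rangle \ \le\ \tfrac{\alpha^2}{2}\,\|\nabla f(x)\|^2 + \tfrac{1}{2}\|z - u\|^2 - \tfrac{1}{2}\|z^{+} - u\|^2 \quad \forall u.$$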
“…The intuition behind this algorithm puzzled researchers for decades, and many articles are devoted to understanding the underlying mechanism (Allen-Zhu and Orecchia, 2014; Defazio, 2019; Ahn, 2020) and the role of the small yet crucial modification compared to HB (Flammarion and Bach, 2015; Lessard et al, 2016; Hu and Lessard, 2017). Notwithstanding the theoretical value of these contributions, they are arguably only of a descriptive nature and leave open more fundamental questions on the reason behind acceleration.…”
Section: Introduction (citation type: mentioning)
Confidence: 99%
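
For readers unfamiliar with the "small yet crucial modification" relative to HB (the heavy-ball method) mentioned in that statement, the two updates are usually written side by side as follows (step size η and momentum β are generic; this is the standard textbook formulation, not a quotation from the cited works):

$$\text{heavy ball (HB):}\quad x_{k+1} = x_k - \eta\,\nabla f(x_k) + \beta\,(x_k - x_{k-1}),$$
$$\text{Nesterov:}\quad x_{k+1} = x_k - \eta\,\nabla f\big(x_k + \beta\,(x_k - x_{k-1})\big) + \beta\,(x_k - x_{k-1}),$$

i.e. the only change is that the gradient is evaluated at the extrapolated (look-ahead) point rather than at the current iterate.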