2022
DOI: 10.1137/21m1450677
DESTRESS: Computation-Optimal and Communication-Efficient Decentralized Nonconvex Finite-Sum Optimization

Cited by 6 publications (7 citation statements). References 20 publications.

“…Sun, Lu, and Hong (2020) provided the first decentralized stochastic algorithm, D-GET, combining variance reduction and gradient tracking. Li, Li, and Chi (2022); Xin, Khan, and Kar (2022) further proposed algorithms with improved complexity bounds. Recently, DEAREST (Luo and Ye 2022) became the first decentralized stochastic algorithm to achieve both optimal computation and communication complexity.…”
Section: Decentralized Stochastic First-order Methods (mentioning)
confidence: 99%
“…Convergence analysis of these works does not apply to Problem (1) due to the mismatch of problem assumptions. The other class of methods (Li, Li, and Chi 2022; Luo and Ye 2022; Xin, Khan, and Kar 2022) assumes that the component functions $f_{i,j}(\cdot)$ are nonconvex and the global objective function $f(\cdot)$ is also possibly nonconvex. Consequently, the rate achieved by these methods is not optimal for Problem (1).…”
Section: Introduction (mentioning)
confidence: 99%
“…More recently, gradient tracking has been utilized to further enhance the convergence rate of new methods; see (Lu et al. 2019; Zhang and You 2020; Koloskova, Lin, and Stich 2021; Xin, Khan, and Kar 2021b) for further discussions. Variance reduction methods that mimic updates from the SARAH (Nguyen et al. 2017b) and SPIDER (Wang et al. 2019) methods provide optimal gradient complexity results at the expense of large batch computations; examples include D-SPIDER-SFO (Pan, Liu, and Wang 2020), D-GET (Sun, Lu, and Hong 2020), GT-SARAH (Xin, Khan, and Kar 2022), and DESTRESS (Li, Li, and Chi 2022). To avoid the large batch requirement of these methods, the STORM (Cutkosky and Orabona 2019; Xu and Xu 2023) and Hybrid-SGD (Tran-Dinh et al. 2022a) methods have also been adapted to the decentralized setting; see GT-STORM (Zhang et al. 2021b) and GT-HSGD (Xin, Khan, and Kar 2021a).…”
Section: Related Work (mentioning)
confidence: 99%
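
The SARAH-style recursive estimator referenced in this passage can be illustrated compactly. Below is a minimal single-node sketch of the estimator on a synthetic least-squares finite sum; it is illustrative only, it omits the gradient-tracking and communication steps that GT-SARAH and DESTRESS add on top, and the problem data, step size, and epoch length are placeholder assumptions.

```python
# Minimal sketch of a SARAH-style recursive variance-reduced gradient
# estimator (the building block mimicked by D-GET, GT-SARAH, DESTRESS),
# shown on a synthetic least-squares finite sum. This is NOT the
# decentralized methods themselves: gradient tracking and communication
# over a mixing matrix are omitted, and all problem data are made up.
import numpy as np

rng = np.random.default_rng(0)
N, d = 200, 10                              # number of components, dimension
A = rng.normal(size=(N, d))
b = rng.normal(size=N)

def grad_i(x, i):
    """Gradient of the i-th component f_i(x) = 0.5 * (a_i^T x - b_i)^2."""
    return (A[i] @ x - b[i]) * A[i]

def full_grad(x):
    """Gradient of f(x) = (1/N) * sum_i f_i(x)."""
    return (A @ x - b) @ A / N

eta = 0.5 / np.max(np.sum(A ** 2, axis=1))  # conservative step size (assumption)
x = np.zeros(d)
for _ in range(20):                         # outer epochs
    v = full_grad(x)                        # periodic full-gradient restart
    x_prev, x = x, x - eta * v
    for _ in range(N):                      # inner recursive updates
        i = rng.integers(N)
        # SARAH recursion: correct the previous estimate with a gradient
        # difference instead of re-anchoring at a snapshot (as SVRG would)
        v = grad_i(x, i) - grad_i(x_prev, i) + v
        x_prev, x = x, x - eta * v
print("final gradient norm:", np.linalg.norm(full_grad(x)))
```
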
“…Sun et al. [32] first applied variance reduction and gradient tracking to decentralized nonconvex finite-sum optimization and proposed the algorithm called Decentralized Gradient Estimation and Tracking (D-GET). Later, Xin et al. [38] proposed GT-SARAH, which improved the complexity of D-GET in terms of the dependency on m and n. Li et al. [18] further improved the result of GT-SARAH by proposing DEcentralized STochastic REcurSive gradient methodS (DESTRESS), which requires $\mathcal{O}\big(n + \sqrt{n/m}\,L\varepsilon^{-2}\big)$ per-agent IFO calls and $\mathcal{O}\big(\sqrt{mn}$…”
Section: Algorithms (mentioning)
confidence: 99%
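
As a quick sanity check on the scaling quoted above (assuming the per-agent bound is read as $\mathcal{O}(n + \sqrt{n/m}\,L\varepsilon^{-2})$), aggregating over the m agents recovers the optimal centralized IFO complexity for a finite sum of $N = mn$ components:

```latex
% Aggregating the quoted per-agent IFO bound over m agents
% (assuming the per-agent bound O(n + sqrt(n/m) L eps^{-2}) quoted above)
m \cdot \mathcal{O}\!\Big(n + \sqrt{\tfrac{n}{m}}\, L \varepsilon^{-2}\Big)
  = \mathcal{O}\!\Big(mn + \sqrt{mn}\, L \varepsilon^{-2}\Big),
\qquad N := mn .
```

Since $\mathcal{O}(N + \sqrt{N}\,L\varepsilon^{-2})$ matches the optimal IFO complexity of centralized variance-reduced methods for nonconvex finite sums, this is consistent with the "computation-optimal" claim in the paper's title.
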
“…$x_d]^{\top} \in \mathbb{R}^d$ is the vector of the classifier. We compare the proposed DEAREST with the baseline algorithms GT-SARAH [37] and DESTRESS [18] on three real-world datasets: "a9a" (mn = 32,560, d = 123), "w8a" (mn = 49,740, d = 300), and "rcv1" (mn = 20,420, d = 47,236). All of the datasets can be downloaded from the LIBSVM repository [6].…”
Section: Numerical Experiments (mentioning)
confidence: 99%
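
For context on the experimental setup described above, here is a minimal sketch of loading one of the cited LIBSVM datasets and evaluating a mini-batch gradient with respect to the classifier vector $x \in \mathbb{R}^d$. The file path ("a9a"), the logistic loss, the nonconvex regularizer $\lambda \sum_k x_k^2/(1+x_k^2)$, and the batch size are assumptions for illustration; the exact objective used in the cited experiments is not reproduced here.

```python
# Minimal sketch: load a LIBSVM dataset and evaluate a mini-batch gradient
# for a linear classifier x in R^d. The logistic loss, the nonconvex
# regularizer lambda * sum_k x_k^2 / (1 + x_k^2), the file path, and the
# batch size are illustrative assumptions, not the cited papers' exact setup.
import numpy as np
from scipy.special import expit
from sklearn.datasets import load_svmlight_file

X, y = load_svmlight_file("a9a")        # assumes "a9a" was downloaded from LIBSVM
y = np.where(y > 0, 1.0, -1.0)          # labels in {-1, +1}
mn, d = X.shape                         # total samples (mn) and dimension (d)
lam = 1e-3                              # regularization weight (assumption)

def minibatch_grad(x, idx):
    """Gradient of mean logistic loss + nonconvex regularizer over rows idx."""
    a = X[idx]                          # sparse mini-batch, shape (len(idx), d)
    margins = y[idx] * (a @ x)
    # d/dx log(1 + exp(-margin)) = -y_i * a_i * sigmoid(-margin)
    coef = -y[idx] * expit(-margins)
    g_loss = np.asarray(a.T @ coef).ravel() / len(idx)
    g_reg = lam * 2.0 * x / (1.0 + x ** 2) ** 2   # derivative of x^2/(1+x^2)
    return g_loss + g_reg

x0 = np.zeros(d)
print(minibatch_grad(x0, np.arange(128)).shape)   # -> (d,)
```
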