2021
DOI: 10.48550/arxiv.2110.01165
Preprint

DESTRESS: Computation-Optimal and Communication-Efficient Decentralized Nonconvex Finite-Sum Optimization

Abstract: Emerging applications in multi-agent environments such as internet-of-things, networked sensing, autonomous systems and federated learning, call for decentralized algorithms for finite-sum optimizations that are resource-efficient in terms of both computation and communication. In this paper, we consider the prototypical setting where the agents work collaboratively to minimize the sum of local loss functions by only communicating with their neighbors over a predetermined network topology. We develop a new alg…

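As a point of reference, the finite-sum setting described in the abstract can be written in a standard form (the notation below is illustrative and is not taken from this page): each of $n$ agents holds $m$ local samples and the network jointly solves

$$\min_{x \in \mathbb{R}^d} \; f(x) := \frac{1}{n}\sum_{i=1}^{n} f_i(x), \qquad f_i(x) := \frac{1}{m}\sum_{j=1}^{m} \ell\bigl(x;\, z_{i,j}\bigr),$$

where $z_{i,j}$ denotes the $j$-th sample held by agent $i$, and agents may exchange information only with their neighbors in the predetermined communication graph.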
Cited by 2 publications (3 citation statements)
References 12 publications (25 reference statements)
“…Among them, gradient tracking (Qu and Li, 2017; Di Lorenzo and Scutari, 2016; Nedic et al., 2017), which applies the idea of dynamic average consensus (Zhu and Martínez, 2010) to global gradient estimation, provides a systematic approach to reduce the variance and has been successfully applied to decentralize many algorithms with faster rates of convergence (Li et al., 2020a; Sun et al., 2019). For nonconvex problems, a small sample of gradient-tracking-aided algorithms includes GT-SAGA (Xin et al., 2021), D-GET (Sun et al., 2020), GT-SARAH (Xin et al., 2020), and DESTRESS (Li et al., 2021a). Our BEER algorithm also leverages gradient tracking to eliminate the strong bounded gradient and bounded dissimilarity assumptions.…”
Section: Assumptions
mentioning
confidence: 99%
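The statement above refers to gradient tracking, i.e., running dynamic average consensus on local gradients so that each agent's tracker follows the global gradient. Below is a minimal, illustrative sketch of that idea only (not DESTRESS, BEER, or any other cited algorithm); the function name, the quadratic toy example, and the mixing matrix W are assumptions chosen for the demonstration, with W taken to be doubly stochastic.

```python
import numpy as np

def gradient_tracking(grads, W, x0, step_size=0.05, num_iters=200):
    """Minimal decentralized gradient-tracking loop (illustrative sketch only).

    grads : list of callables; grads[i](x) returns agent i's local gradient at x
    W     : (n, n) doubly stochastic mixing matrix matching the network topology
    x0    : (n, d) array of initial iterates, one row per agent
    """
    n, _ = x0.shape
    x = x0.copy()
    # Each agent starts its tracker at its own local gradient.
    y = np.stack([grads[i](x[i]) for i in range(n)])
    for _ in range(num_iters):
        # Mix iterates with neighbors, then step along the tracked direction.
        x_new = W @ x - step_size * y
        g_new = np.stack([grads[i](x_new[i]) for i in range(n)])
        g_old = np.stack([grads[i](x[i]) for i in range(n)])
        # Dynamic average consensus on gradients: mix trackers and add the
        # local gradient increment so y keeps tracking the average gradient.
        y = W @ y + g_new - g_old
        x = x_new
    return x

# Toy example: 4 agents on a ring jointly minimize sum_i ||x - a_i||^2,
# whose minimizer is the average of the a_i.
rng = np.random.default_rng(0)
a = rng.normal(size=(4, 3))
grads = [lambda x, ai=ai: 2.0 * (x - ai) for ai in a]
W = np.array([[0.5, 0.25, 0.0, 0.25],
              [0.25, 0.5, 0.25, 0.0],
              [0.0, 0.25, 0.5, 0.25],
              [0.25, 0.0, 0.25, 0.5]])
x_final = gradient_tracking(grads, W, x0=np.zeros((4, 3)))
print(np.allclose(x_final, a.mean(axis=0), atol=1e-3))  # every agent near the optimum
```

Because W is doubly stochastic, the average of the trackers y equals the average of the current local gradients at every iteration, which is the variance-reduction effect the quoted statement alludes to.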
“…Note that Theorem 4.2 is a strict generalization of Theorem 4.1, and thus we will directly prove Theorem 4.2. This proof makes use of Lemma B.3 and Lemma B.4 by constructing a proper Lyapunov function and demonstrating its descending property using a linear system argument, which is also used in, e.g., Li et al. (2021a) and Liao et al. (2021).…”
mentioning
confidence: 99%
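For readers unfamiliar with the linear system argument mentioned in this statement, the general template (a sketch only; the specific Lyapunov function and constants used by the cited proofs are not shown on this page) is to collect the relevant error quantities, such as the optimality gap, consensus error, and gradient-tracking error, into a nonnegative vector $\mathbf{e}_t$ and establish a coupled recursion

$$\mathbf{e}_{t+1} \le A\,\mathbf{e}_t + \mathbf{b}_t \quad \text{(entrywise)},$$

where $A$ is a nonnegative matrix with spectral radius $\rho(A) < 1$. Choosing a positive weight vector $\mathbf{p}$ (for instance, related to a left eigenvector of $A$) yields a Lyapunov function $\Phi_t = \mathbf{p}^\top \mathbf{e}_t$ satisfying $\Phi_{t+1} \le \gamma\,\Phi_t + \mathbf{p}^\top \mathbf{b}_t$ for some $\gamma \in (0,1)$, and unrolling this inequality gives the stated convergence rate.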
“…For instance, Sun et al. (2020) employ a scheme with both gradient tracking and variance reduction to solve a smooth (possibly non-convex) problem and show that it converges to a stationary point sublinearly. Li et al. (2021) proposed a similar algorithm with a nested loop structure for the sake of improving its overall complexity. Xin et al. (2020) and Jiang et al. (2022) consider a similar GT-VR framework and obtain a linear rate for strongly convex problems and an O(1/k) rate for the non-convex setting, respectively.…”
Section: Introduction
mentioning
confidence: 99%
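As a companion to the nested-loop structure mentioned in this statement, here is a rough, single-agent sketch of a SARAH-style recursive variance-reduced gradient estimator, the kind of inner/outer-loop component that GT-VR methods combine with gradient tracking. It is illustrative only; the function and parameter names are assumptions, and this is not the exact estimator used in DESTRESS or the other cited algorithms.

```python
import numpy as np

def vr_local_estimator(grad_j, x_seq, m, batch_size, rng):
    """SARAH-style nested-loop gradient estimator for a single agent (sketch).

    grad_j : callable; grad_j(j, x) returns the gradient of the j-th local sample at x
    x_seq  : list of iterates [x_0, ..., x_T] produced by the surrounding algorithm
    m      : number of local samples held by the agent
    """
    # Outer-loop refresh: a full pass over the local data anchors the estimator.
    v = np.mean([grad_j(j, x_seq[0]) for j in range(m)], axis=0)
    estimates = [v]
    # Inner loop: cheap recursive corrections from small random minibatches.
    for t in range(1, len(x_seq)):
        batch = rng.choice(m, size=batch_size, replace=False)
        v = v + np.mean(
            [grad_j(j, x_seq[t]) - grad_j(j, x_seq[t - 1]) for j in batch], axis=0
        )
        estimates.append(v)
    return estimates

# Tiny demo on least-squares samples f_j(x) = 0.5 * (a_j @ x - b_j)^2.
rng = np.random.default_rng(1)
A, b = rng.normal(size=(8, 3)), rng.normal(size=8)
grad_j = lambda j, x: (A[j] @ x - b[j]) * A[j]
path = [np.zeros(3) + 0.1 * t for t in range(5)]  # some arbitrary iterates
ests = vr_local_estimator(grad_j, path, m=8, batch_size=2, rng=rng)
```

The outer-loop full-gradient refresh is what keeps the inner iterations cheap: each inner step touches only a small minibatch, yet the recursive correction stays anchored to a true local gradient computed at the start of the loop.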