2019
DOI: 10.1109/tsp.2019.2926022

A Decentralized Proximal-Gradient Method With Network Independent Step-Sizes and Separated Convergence Rates

Abstract: This paper considers the problem of decentralized optimization with a composite objective containing smooth and non-smooth terms. To solve the problem, a proximal-gradient scheme is studied: the smooth and non-smooth terms are handled by a gradient update and a proximal update, respectively. The studied algorithm is closely related to a previous decentralized optimization algorithm, PG-EXTRA [37], but has a few advantages. First of all, in our new scheme, agents use uncoordinated step-sizes and the…
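To illustrate the gradient-update / proximal-update split described in the abstract, here is a minimal, generic sketch of a decentralized proximal-gradient loop. It is not the paper's algorithm; it only shows the standard structure, assuming each agent's local copy is a row of `x`, `W` is a symmetric doubly stochastic mixing matrix, and the non-smooth term is the ℓ1 norm (so its proximal operator is soft-thresholding). The names `grad_smooth`, `alpha`, and the function signatures are illustrative assumptions.

```python
import numpy as np

def prox_l1(v, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding), a common non-smooth example."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def decentralized_prox_grad(grad_smooth, W, x0, alpha, num_iters, prox=prox_l1):
    """Generic decentralized proximal-gradient loop (illustrative sketch only).

    Each row of `x` is one agent's local copy; `W` mixes information between
    neighbors. The smooth term enters through a gradient step, the non-smooth
    term through a proximal step.
    """
    x = x0.copy()
    for _ in range(num_iters):
        z = W @ x - alpha * grad_smooth(x)   # mixing + gradient update (smooth term)
        x = prox(z, alpha)                   # proximal update (non-smooth term)
    return x
```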

Cited by 196 publications (196 citation statements)
References 61 publications

Citation statements (ordered by relevance):

“…Although the update (3b) has two gradient evaluations, they are evaluated at successive iterates, so EXTRA can easily be implemented with one gradient evaluation per iteration by storing the previous gradient in memory. Several additional linear-rate algorithms have since been proposed [9,13,16,21,36,38,39]. Each of these methods has updates similar to (3) in that they require agents to store the previous iterate and/or gradient in memory.…”
Section: Introduction (mentioning)
confidence: 99%
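To make the bookkeeping in the excerpt above concrete, here is a minimal sketch of an EXTRA-style loop that caches the previous gradient so that each iteration makes a single gradient call. The update (3) referenced in the excerpt belongs to the citing paper and is not reproduced here; the recursion below uses the commonly stated EXTRA form with W̃ = (I + W)/2, and `grad`, `W`, `x0`, and `alpha` are assumed inputs.

```python
import numpy as np

def extra(grad, W, x0, alpha, num_iters):
    """EXTRA-style loop with one gradient evaluation per iteration (sketch).

    `grad(x)` is assumed to return the stacked local gradients (one row per
    agent); `W` is a symmetric doubly stochastic mixing matrix.
    """
    n = W.shape[0]
    I = np.eye(n)
    W_tilde = 0.5 * (I + W)                # common choice for the second mixing matrix

    x_prev = x0.copy()
    g_prev = grad(x_prev)                  # gradient at the initial point
    x = W @ x_prev - alpha * g_prev        # first (DGD-like) step

    for _ in range(num_iters):
        g = grad(x)                        # the only gradient call this iteration
        x_next = (I + W) @ x - W_tilde @ x_prev - alpha * (g - g_prev)
        x_prev, x = x, x_next
        g_prev = g                         # cache the current gradient for the next iteration
    return x
```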
“…Other distributed optimization algorithms solve (1) but lie outside the scope of the present work. These include algorithms involving dual decomposition [4,24,25], inexact dual methods [7], proximal algorithms [27], asynchronous algorithms [15,37], weakly convex cases [13,20,26], and accelerated methods [20,33,34]. Although linear convergence rates were obtained for many of the algorithms cited above, each algorithm differs in the nature and strength of its convergence guarantees.…”
Section: Introduction (mentioning)
confidence: 99%
“…Our focus is on the design of distributed algorithms for Problem (P) that provably converge at a linear rate. When G = 0, several distributed schemes enjoying such a property have been proposed in the literature; examples include EXTRA [1], AugDGM [2], NEXT [3], SONATA [4], [5], DIGing [6], NIDS [7], Exact Diffusion [8], MSDA [9], and the distributed algorithms in [10], [11], and [12]. When G ≠ 0, results are scarce; to our knowledge, the only two schemes available in the literature achieving a linear rate for (P) are SONATA [5] and the distributed proximal gradient algorithm [13].…”
Section: Introduction (mentioning)
confidence: 99%
“…Because of that, in general, they cannot achieve the rate of the centralized gradient algorithm (and thus do not fully address Q2). Works partially addressing Q2 are the following: MSDA [9] uses multiple communication steps to achieve the lower complexity bound of (P) when G = 0; and the algorithms in [16] and [7] achieve a linear rate and can adjust the number of communications performed at each iteration to match the rate of centralized gradient descent. However, it is not clear how to extend (if possible) these methods and their convergence analysis to the more general composite (i.e., G ≠ 0) setting (P).…”
Section: Introduction (mentioning)
confidence: 99%
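The excerpt above refers to methods that perform several communication rounds per gradient evaluation in order to approach the centralized rate. A minimal illustration of that idea, assuming a gossip matrix `W` and agent states stacked in the rows of `x` (both hypothetical names here), is simply to apply the mixing step several times per optimization step:

```python
import numpy as np

def multi_round_mixing(x, W, rounds):
    """Apply `rounds` neighbor-communication (gossip) steps to the stacked
    agent states `x`; more rounds yield a better effective spectral gap at
    the price of extra communication per iteration."""
    for _ in range(rounds):
        x = W @ x          # one communication round with neighbors
    return x
```

Accelerated variants (e.g., Chebyshev-type polynomials of W, as used in MSDA-style methods) achieve the same effect with fewer rounds; the plain repetition above is only the simplest version of the idea.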