2022
DOI: 10.1109/tac.2022.3180695

A Compressed Gradient Tracking Method for Decentralized Optimization With Linear Convergence

Abstract: Communication compression techniques are of growing interest for solving the decentralized optimization problem under limited communication, where the global objective is to minimize the average of local cost functions over a multiagent network using only local computation and peer-to-peer communication. In this article, we propose a novel compressed gradient tracking algorithm (C-GT) that combines the gradient tracking technique with communication compression. In particular, C-GT is compatible with a general class…
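
The abstract describes C-GT as gradient tracking combined with communication compression. The sketch below is only a hedged illustration of that general idea in Python: standard gradient tracking updates in which the quantities exchanged between agents pass through a generic compression operator. The mixing matrix W, step size gamma, the compress function, and the use of direct state compression are assumptions for illustration, not the exact C-GT recursion from the paper (C-GT compresses differences against auxiliary reference variables).

```python
import numpy as np

def compress(v, k=2):
    """Illustrative top-k sparsifier: keep the k largest-magnitude entries of v."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def compressed_gradient_tracking(grads, x0, W, gamma=0.1, iters=200):
    """Hedged sketch of gradient tracking with compressed communication.

    grads : list of callables; grads[i](x) returns agent i's local gradient at x
    x0    : (n, d) array of initial local iterates
    W     : (n, n) doubly stochastic mixing matrix
    """
    n, _ = x0.shape
    x = x0.copy()
    y = np.stack([grads[i](x[i]) for i in range(n)])  # gradient-tracking variables
    g_prev = y.copy()
    eye = np.eye(n)
    for _ in range(iters):
        # Agents exchange compressed copies of their states. (Assumption: direct
        # compression of x and y; the paper's C-GT instead compresses differences
        # against local reference states.)
        x_hat = np.stack([compress(xi) for xi in x])
        y_hat = np.stack([compress(yi) for yi in y])
        x = x + (W - eye) @ x_hat - gamma * y
        g = np.stack([grads[i](x[i]) for i in range(n)])
        y = y + (W - eye) @ y_hat + g - g_prev
        g_prev = g
    return x
```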

Cited by 20 publications (5 citation statements) · References 44 publications (85 reference statements)

“…It is worthwhile to point out that the proposed LSGT and MUST methods are, to the best of our knowledge, the first stochastic GT algorithms with multiple local SGD updates for decentralized learning. In the future, we plan to extend this framework to time-varying communication network topologies [35] and to settings with compression [36].…”
Section: Discussion (mentioning)
confidence: 99%
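
The statement above concerns stochastic gradient tracking with multiple local SGD updates between communication rounds. As a hedged illustration only (not the actual LSGT or MUST updates, which are defined in the citing paper), the Python fragment below shows one way E local stochastic steps might be interleaved with a single mixing and tracker update; stoch_grad, W, gamma, and E are assumed names.

```python
import numpy as np

def local_sgd_gradient_tracking(stoch_grad, x0, W, gamma=0.05, E=5, rounds=50):
    """Hedged sketch: E local stochastic steps per agent between communication rounds.

    stoch_grad : callable; stoch_grad(i, x) returns a sampled gradient for agent i at x
    x0         : (n, d) array of initial local iterates
    W          : (n, n) doubly stochastic mixing matrix
    """
    n, _ = x0.shape
    x = x0.copy()
    y = np.stack([stoch_grad(i, x[i]) for i in range(n)])  # tracking variables
    g_prev = y.copy()
    for _ in range(rounds):
        # E local stochastic updates with no communication (assumption: each local
        # step follows the tracked direction corrected by a fresh sampled gradient).
        for _ in range(E):
            g_local = np.stack([stoch_grad(i, x[i]) for i in range(n)])
            x = x - gamma * (y + g_local - g_prev)
        # One communication round: mix the iterates and refresh the trackers.
        x = W @ x
        g = np.stack([stoch_grad(i, x[i]) for i in range(n)])
        y = W @ y + g - g_prev
        g_prev = g
    return x
```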
“…so that the coefficient of the last term in the RHS of (35) is negative and the term can be removed. In addition, we have used the properties λ_w^2 < 1 and E ≥ 1 to obtain bounds for the coefficients of the 3rd and 4th terms in the RHS of (36). Finally, after dividing both sides of (36) by TγE/4, we obtain the results in Theorem 1.…”
Section: Proof of Theorem (mentioning)
confidence: 99%
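
The proof step quoted above drops a non-positive term, bounds coefficients using λ_w^2 < 1 and E ≥ 1, and divides by TγE/4. The LaTeX fragment below records only the generic pattern of that last step, with placeholder quantities a_t, b_t, c_t (hypothetical symbols, not the actual terms of (35)-(36) in the citing paper).

```latex
% Generic pattern (placeholder symbols): suppose each iteration satisfies
%   a_{t+1} \le a_t - \tfrac{\gamma E}{4}\, b_t + c_t ,
% with a_t \ge 0. Summing over t = 0,\dots,T-1 telescopes the a_t terms,
% and dividing both sides by T\gamma E/4 gives
\[
  \frac{1}{T}\sum_{t=0}^{T-1} b_t
  \;\le\; \frac{4\,a_0}{T\gamma E} \;+\; \frac{4}{T\gamma E}\sum_{t=0}^{T-1} c_t .
\]
```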
“…In recent years, various techniques have been developed to reduce communication costs [27,31]. They have been extensively incorporated into centralized optimization methods [1,23,30] and decentralized methods [8,9,18,20,33]. This motivates us to provide an extension of D-ASCGD by combining it with a communication compression method, which reads as follows.…”
Section: Compressed D-ASCGD Methods (mentioning)
confidence: 99%
“…For a vector x ∈ R^d or z ∈ R^p, the class of compressors satisfying (28) or (29) is broad, including random quantization [23,26], sparsification [15,19], and the norm-sign compressor [18,33]. For a matrix y ∈ R^{d×p}, we may construct a dp-length vector by stacking up the columns of y and then apply the predetermined compressors to the newly constructed vector.…”
Section: And Compute Function Values (mentioning)
confidence: 99%
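
The citation above lists several compressor classes (random quantization, sparsification, norm-sign) and the column-stacking trick for matrices. As a hedged illustration, the Python sketch below gives textbook versions of a top-k sparsifier and a norm-sign-style compressor, plus vectorizing a matrix before compression; the conditions (28)-(29) from the citing paper are not reproduced, and these generic forms are assumptions rather than the exact operators used there.

```python
import numpy as np

def top_k(v, k):
    """Sparsification: keep only the k largest-magnitude entries of v."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def norm_sign(v):
    """Norm-sign-style compressor: transmit a scaled norm and the sign pattern."""
    return (np.linalg.norm(v, ord=1) / v.size) * np.sign(v)

def compress_matrix(Y, compressor, **kwargs):
    """Stack the columns of a (d, p) matrix into a dp-vector, compress, reshape back."""
    d, p = Y.shape
    vec = Y.reshape(-1, order="F")                 # column stacking
    return compressor(vec, **kwargs).reshape(d, p, order="F")

# Example: compress a 4x3 matrix with top-k sparsification on its vectorized form.
Y = np.arange(12, dtype=float).reshape(4, 3)
Y_hat = compress_matrix(Y, top_k, k=5)
```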
“…For another example, let f_i(x) := (1/|S_i|) Σ_{ζ_j ∈ S_i} F(x, ζ_j) denote an empirical risk function, where S_i is the local dataset of agent i. Under this setting, the gradient estimate of f_i(x) can incur noise from various sources such as sampling and discretization [15]. The second condition in (2) holds for many popular problems such as linear regression, smooth SVMs, logistic regression, and softmax regression [16].…”
Section: Introduction (mentioning)
confidence: 99%
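
The quoted passage defines the local empirical risk f_i(x) = (1/|S_i|) Σ_{ζ_j ∈ S_i} F(x, ζ_j) and notes that gradient estimates incur sampling noise. Below is a small hedged sketch using logistic regression (one of the examples named in the quote) with an assumed local dataset (A, y) for agent i, contrasting the full-batch gradient with a mini-batch estimate; all function and parameter names are illustrative.

```python
import numpy as np

def logistic_loss_grad(x, a, b):
    """Gradient of F(x, ζ) = log(1 + exp(-b * a^T x)) for a single sample ζ = (a, b)."""
    z = b * (a @ x)
    return (-b / (1.0 + np.exp(z))) * a

def local_gradient(x, A, y, batch=None, rng=None):
    """Gradient of f_i(x) = (1/|S_i|) * Σ_j F(x, ζ_j) over agent i's dataset (A, y).

    With batch=None the exact full-batch gradient is returned; otherwise a
    mini-batch of that size is sampled, which is where the stochastic noise
    mentioned in the quote comes from.
    """
    if batch is not None:
        rng = rng or np.random.default_rng()
        idx = rng.choice(len(y), size=batch, replace=False)
        A, y = A[idx], y[idx]
    grads = np.stack([logistic_loss_grad(x, A[j], y[j]) for j in range(len(y))])
    return grads.mean(axis=0)
```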