BEER: Fast $O(1/T)$ Rate for Decentralized Nonconvex Optimization with Communication Compression

Zhao, Haoyu; Li, Boyue; Li, Zhize; Richtárik, Peter; Chi, Yuejie

doi:10.48550/arxiv.2201.13320

Cited by 2 publications

(2 citation statements)

References 22 publications


“…(2) primal-dual like methods [22,17,15,14,54,51]; (3) gradient tracking based algorithms [20,55,37,50,47].…”

Section: Related Work (mentioning)

confidence: 99%

CEDAS: A Compressed Decentralized Stochastic Gradient Method with Improved Convergence

Huang, Pu

2023

Preprint


In this paper, we consider solving the distributed optimization problem over a multi-agent network under the communication restricted setting. We study a compressed decentralized stochastic gradient method, termed "compressed exact diffusion with adaptive stepsizes (CEDAS)", and show the method asymptotically achieves comparable convergence rate as centralized SGD for both smooth strongly convex objective functions and smooth nonconvex objective functions under unbiased compression operators. In particular, to our knowledge, CEDAS enjoys so far the shortest transient time (with respect to the graph specifics) for achieving the convergence rate of centralized SGD, which behaves as $O(nC^3/(1-\lambda_2)^2)$ under smooth strongly convex objective functions, and $O(n^3C^6/(1-\lambda_2)^4)$ under smooth nonconvex objective functions, where $(1-\lambda_2)$ denotes the spectral gap of the mixing matrix, and $C > 0$ is the compression-related parameter. Numerical experiments further demonstrate the effectiveness of the proposed algorithm.
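The abstract above assumes unbiased compression operators but does not name one. As an illustration only, the sketch below implements random-k sparsification, a common unbiased compressor; it is not taken from the CEDAS paper. The function names `rand_k` and `empirical_mean` are hypothetical. How this operator's variance factor ($d/k - 1$) relates to the paper's parameter $C$ is not stated here and should be treated as an open assumption.

```python
import random

def rand_k(x, k, rng):
    """Random-k sparsification (illustrative, not from the CEDAS paper).

    Keeps k coordinates chosen uniformly at random and scales them by d/k,
    so that the expectation of the output equals x (unbiasedness).
    """
    d = len(x)
    if not 1 <= k <= d:
        raise ValueError("k must satisfy 1 <= k <= len(x)")
    kept = set(rng.sample(range(d), k))
    scale = d / k
    return [scale * v if i in kept else 0.0 for i, v in enumerate(x)]

def empirical_mean(x, k, trials, seed=0):
    """Average many compressed copies of x to check unbiasedness numerically."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(trials):
        q = rand_k(x, k, rng)
        total = [t + v for t, v in zip(total, q)]
    return [t / trials for t in total]

if __name__ == "__main__":
    x = [1.0, -2.0, 3.0, 4.0]
    print(rand_k(x, 2, random.Random(1)))   # only 2 of 4 entries are sent
    print(empirical_mean(x, 2, 20000))      # should be close to x
```

Each call transmits only k of the d coordinates, which is the per-round saving that compression methods aim for; the price is extra variance that the algorithm's analysis has to absorb.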


“…Therefore, it is very important to design FL algorithms to reduce the overall communication cost, which takes into account both the number of communication rounds and the cost per communication round for reaching a desired accuracy. With these two quantities in mind, there are two principal approaches for communication-efficient FL: 1) local methods, where in each communication round, clients run multiple local update steps before communicating with the server, in the hope of reducing the number of communication rounds, e.g., [47,43,36,24,35,61,51,2,67,50,49,42]; 2) compression methods, where clients send compressed communication message to the server, in the hope of reducing the cost per communication round, e.g., [4,37,60,28,34,48,52,25,53,19,41,68]. While both categories have garnered significant attention in recent years, we will focus on the second approach based on communication compression to enhance communication efficiency.…”

Section: Motivation: Privacy-utility-communication Trade-offs (mentioning)

confidence: 99%

SoteriaFL: A Unified Framework for Private Federated Learning with Communication Compression

Li, Zhao, Li et al. 2022

Preprint

Self Cite


To enable large-scale machine learning in bandwidth-hungry environments such as wireless networks, significant progress has been made recently in designing communication-efficient federated learning algorithms with the aid of communication compression. On the other end, privacy-preserving, especially at the client level, is another important desideratum that has not been addressed simultaneously in the presence of advanced communication compression techniques yet. In this paper, we propose a unified framework that enhances the communication efficiency of private federated learning with communication compression. Exploiting both general compression operators and local differential privacy, we first examine a simple algorithm that applies compression directly to differentially-private stochastic gradient descent, and identify its limitations. We then propose a unified framework SoteriaFL for private federated learning, which accommodates a general family of local gradient estimators including popular stochastic variance-reduced gradient methods and the state-of-the-art shifted compression scheme. We provide a comprehensive characterization of its performance trade-offs in terms of privacy, utility, and communication complexity, where SoteriaFL is shown to achieve better communication complexity without sacrificing privacy nor utility than other private federated learning algorithms without communication compression.
