IEEE INFOCOM 2019 - IEEE Conference on Computer Communications 2019
DOI: 10.1109/infocom.2019.8737489

Compressed Distributed Gradient Descent: Communication-Efficient Consensus over Networks

Abstract: Network consensus optimization has received increasing attention in recent years and has found important applications in many scientific and engineering fields. To solve network consensus optimization problems, one of the most well-known approaches is the distributed gradient descent method (DGD). However, in networks with slow communication rates, DGD's performance is unsatisfactory for solving high-dimensional network consensus problems due to the communication bottleneck. This motivates us to design a commu…
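As background for the abstract's reference to DGD: in vanilla DGD, every node mixes its neighbors' iterates through a mixing matrix W and then takes a local gradient step, so each round requires transmitting full d-dimensional iterates; that is the communication bottleneck the paper targets. Below is a minimal NumPy sketch of plain (uncompressed) DGD, offered only as an illustrative assumption, not the paper's compressed algorithm; the names dgd, W, and grads are hypothetical.

```python
import numpy as np

def dgd(W, grads, x0, step=0.05, iters=200):
    """Sketch of vanilla distributed gradient descent (DGD).

    W     : (n, n) doubly-stochastic mixing matrix of the network graph
    grads : list of n callables; grads[i](x) returns node i's local gradient
    x0    : (n, d) array of initial local iterates, one row per node
    """
    x = x0.copy()
    for _ in range(iters):
        mixed = W @ x                                        # consensus (neighbor-averaging) step
        local = np.stack([g(x[i]) for i, g in enumerate(grads)])
        x = mixed - step * local                             # local gradient step
    return x

# Example: two nodes minimizing (x - a_i)^2 on a fully connected 2-node graph.
W = np.array([[0.5, 0.5], [0.5, 0.5]])
grads = [lambda x: 2 * (x - 1.0), lambda x: 2 * (x - 3.0)]
x = dgd(W, grads, np.zeros((2, 1)))
# Both rows end up in a small neighborhood of the consensus minimizer (1 + 3) / 2 = 2;
# exact consensus requires a diminishing step-size.
```

In practice each node only needs its own row of W (its neighborhood weights), so the matrix product above stands in for per-node neighbor averaging.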

Cited by 25 publications (22 citation statements)
References 28 publications

“…The resulting global coefficient matrix W ∈ ℝ^(N×N) with entries w_kl is right-stochastic and is also known as the consensus matrix, which has been proven to reach a consistent estimate at all nodes over the network [19]. The diffusion GN method aims to enhance the interaction among neighboring nodes by aggregating the real-time estimates from the neighborhood as in (13) and integrating them into the local update step as in (14), such that the network-wide sensing data is fused by each individual node as long as the network is connected.…”
Section: Diffusion Gauss-Newton Methods for Node Localization (mentioning)
confidence: 99%
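To make the quoted combine-then-adapt structure concrete, here is a minimal, hypothetical NumPy sketch of the aggregation step with a right-stochastic consensus matrix. The function name diffusion_combine and the array shapes are assumptions; the cited paper's Eqs. (13)-(14) also involve a Gauss-Newton local update that is not reproduced here.

```python
import numpy as np

def diffusion_combine(W, estimates):
    """Neighbor-aggregation (combine) step of a diffusion strategy.

    W         : (n, n) right-stochastic consensus matrix (each row sums to 1);
                W[k, l] > 0 only if node l is in node k's neighborhood.
    estimates : (n, d) array; row k is node k's current local estimate.

    Returns the (n, d) array of aggregated estimates; on a connected
    network, repeated combine + local-update rounds drive all rows
    toward a common estimate.
    """
    assert np.allclose(W.sum(axis=1), 1.0), "W must be right-stochastic"
    return W @ estimates
```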
“…There are four terms in the convergence error of SDM-DSGD in Eq. 7: (I) is the common convergence error that goes to zero as the number of iterations and the step-size increase; (II) is the approximation error between the Lyapunov function and f(x; D), which decreases with the corresponding penalty parameter. These two terms are similar to those in the convergence of DGD-based algorithms [11, 28]; (III) and (IV) are the error terms introduced by the compression, random sampling, as well as the Gaussian masking noises. The following simplified convergence rate result follows immediately from Lemma 2.…”
Section: (C) Under Assumption 1 (mentioning)
confidence: 98%
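Restating the quoted decomposition schematically (a reading aid only, not the actual bound in the cited paper's Eq. 7):

```latex
\underbrace{(\mathrm{I})}_{\substack{\text{vanishes as iterations}\\ \text{and step-size grow}}}
\;+\;
\underbrace{(\mathrm{II})}_{\substack{\text{gap between the Lyapunov}\\ \text{function and } f(\mathbf{x};\mathcal{D})}}
\;+\;
\underbrace{(\mathrm{III})+(\mathrm{IV})}_{\substack{\text{compression, sampling, and}\\ \text{Gaussian masking noise}}}
```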
“…Remark 1. Algorithm 1 is motivated by and bears some similarity with the DGD-type communication-efficient distributed learning algorithms in the literature [7, 11]. In these existing works, rather than exchanging the states directly, the compressed differentials between two successive iterations of the variables are communicated to reduce the communication load.…”
Section: The SDM-DSGD Algorithm (mentioning)
confidence: 99%
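The exchange of compressed differentials described in the quote can be illustrated with a generic top-k sparsifier. This is a hedged sketch: the specific compression operators in [7, 11] and in the cited paper may differ, and topk_compress / send_update are hypothetical names.

```python
import numpy as np

def topk_compress(v, k):
    """Keep only the k largest-magnitude entries of v (a generic compressor)."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def send_update(x_new, x_ref, k):
    """Transmit a compressed differential instead of the full state.

    x_ref is the last reconstruction of this node's state that both the
    sender and its neighbors hold; only the (sparse) change since then
    is communicated, and both sides apply it to stay in sync.
    """
    delta = topk_compress(x_new - x_ref, k)
    x_ref_updated = x_ref + delta
    return delta, x_ref_updated
```

Only the k nonzero entries (values plus indices) need to be sent per round, which is where the communication saving over transmitting the full d-dimensional iterate comes from.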
“…The use cases of such algorithms are the Internet of Things, peer-to-peer (P2P) network sharing, vehicular communication, sensor networks, edge devices, etc. [17], [18]. There exist several works applying decentralized optimization methods for neural networks without a master node [19]-[23]. For instance, the work [19] uses amplified-differential compression for computing the gradient in a synchronous manner and enjoys low computational complexity.…”
Section: Introduction (mentioning)
confidence: 99%