Dual-Way Gradient Sparsification for Asynchronous Distributed Deep Learning

Yan, Zijie; Xiao, Dongjie; Chen, Mengqiang; Zhou, Jing; Wu, Weigang

doi:10.1145/3404397.3404401

Cited by 5 publications

(1 citation statement)

References 11 publications

“…Shi et al 17 proposed a novel global Top‐k sparsification mechanism to address the difficulty of aggregating sparse gradients, which alleviates the network pressure. Yan et al 18 proposed a dual‐way gradient sparsification approach to reduce communication cost, in which workers merely download model differences from the parameter server. Abdi et al 5 presented a quantized compressive sampling for the compression of stochastic gradients or parameters, which employs both a dither quantization and compressive sensing to achieve arbitrarily large compression gains.…”

Section: Relative Work (mentioning)

confidence: 99%

Intra‐cluster aggregation aware routing for distributed training in wireless sensor networks

Chen, Long, Chen et al. 2021

Concurrency and Computation

In wireless sensor networks (WSNs), wireless sensor nodes can be equipped with deep neural network accelerators to deal with the computation challenges in distributed training. However, the communication overhead of distributed training and the limited battery capacity of sensor nodes still impede the broad deployment of distributed training applications. This article investigates distributed training in WSNs by formulating an aggregation-aware routing problem as a non-linear integer programming problem. The objective of the formulated problem is to reduce the training time using data aggregation-aware routing under the constraints of memory size and energy cost. The formulated problem is also proved to be NP-hard. An intra-cluster aggregation-aware routing algorithm is then proposed. The algorithm accelerates the transmission of data packets by integrating K-Means clustering and shortest path routing to choose the aggregators and the route paths. Extensive experiments demonstrate that the proposed algorithm outperforms two classical clustering routing algorithms, UC-LEACH and K-Means, by 29% and 37% in average training time and reduces energy consumption by 21% and 15%, respectively.
