49th International Conference on Parallel Processing - ICPP 2020
DOI: 10.1145/3404397.3404401

Dual-Way Gradient Sparsification for Asynchronous Distributed Deep Learning

Cited by 5 publications
(1 citation statement)
References 11 publications
“…Shi et al. [17] proposed a novel global Top‐k sparsification mechanism to address the difficulty of aggregating sparse gradients, which alleviates the network pressure. Yan et al. [18] proposed a dual‐way gradient sparsification approach to reduce communication cost, in which workers merely download model differences from the parameter server. Abdi et al. [5] presented a quantized compressive sampling for the compression of stochastic gradients or parameters, which employs both a dither quantization and compressive sensing to achieve arbitrarily large compression gains.…”
Section: Relative Work
Citation type: mentioning
confidence: 99%