2020 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGRID)
DOI: 10.1109/ccgrid49817.2020.00-40
Standard Deviation Based Adaptive Gradient Compression For Distributed Deep Learning

Cited by 4 publications (2 citation statements) · References 13 publications
“…Gradient sparsification can achieve a higher compression rate than gradient quantization, but it can seriously affect the convergence and accuracy of the model. The standard deviation-based adaptive gradient compression (SDAGC) method is proposed in [125], which can achieve higher model performance in simultaneous training.…”
Section: A. Communication Cost (mentioning) · confidence: 99%
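As a rough illustration of the idea behind SDAGC-style compression, the sketch below shows a generic standard-deviation-based gradient sparsifier: each worker keeps only the gradient entries whose magnitude exceeds a threshold derived from the gradient's mean and standard deviation, and transmits just the surviving indices and values. The threshold rule (mean + k·std), the function names, and the parameter k are illustrative assumptions, not the exact SDAGC algorithm from the cited paper.

```python
# Illustrative sketch only: a generic standard-deviation-based gradient
# sparsifier. The threshold rule (mean + k * std) and every name below are
# assumptions for exposition, not the paper's exact SDAGC algorithm.
import numpy as np

def sparsify_by_std(grad: np.ndarray, k: float = 2.0):
    """Keep entries whose magnitude exceeds mean + k * std of |grad|.

    Returns the indices and values of the retained entries, i.e. the sparse
    message a worker would send instead of the dense gradient.
    """
    mag = np.abs(grad)
    threshold = mag.mean() + k * mag.std()
    kept = np.nonzero(mag > threshold)[0]
    return kept, grad[kept]

def densify(indices: np.ndarray, values: np.ndarray, size: int) -> np.ndarray:
    """Rebuild a dense gradient from a received sparse message."""
    dense = np.zeros(size, dtype=values.dtype)
    dense[indices] = values
    return dense

# Example: compress a synthetic gradient and report the compression rate.
g = np.random.randn(1_000_000).astype(np.float32)
idx, vals = sparsify_by_std(g, k=2.0)
print(f"kept {idx.size} of {g.size} entries ({100.0 * idx.size / g.size:.2f}%)")
```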
“…To have a reasonable training time, researchers have proposed various techniques. For example, there are many approaches for speeding up the training procedure by improving scalability, such as large-batch training [6], [7], exploiting different forms of parallelism [8], asynchronous training [9], reducing communication during training [10], [11], and so on. Other approaches focus on the statistical efficiency of optimization algorithms to reduce the number of training iterations, such as AdaGrad [12], Adam [13], AdamW [14], and variance-reduced SGD [15], [16].…”
Section: Introduction (mentioning) · confidence: 99%
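Of the optimizer-side approaches listed above, Adam is the most widely used, so the snippet below gives a minimal, self-contained sketch of a single Adam update step (exponential moving averages of the gradient and its square, with bias correction), included purely to illustrate what "statistical efficiency" refers to here. The function name and hyperparameter defaults are the commonly used ones, not values taken from the cited works' experiments.

```python
# Minimal sketch of one Adam update step, shown only to illustrate the
# optimizer-side ("statistical efficiency") line of work cited above.
# Hyperparameter defaults are the commonly used ones, not values from [13].
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply a single Adam update; returns updated parameters and moments.

    t is the 1-based iteration counter used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad        # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad**2     # second moment (mean of grad^2)
    m_hat = m / (1 - beta1**t)                # bias-corrected first moment
    v_hat = v / (1 - beta2**t)                # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```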