2019 IEEE 4th International Workshops on Foundations and Applications of Self* Systems (FAS*W)
DOI: 10.1109/fas-w.2019.00050
Efficient Large-Scale Deep Learning Framework for Heterogeneous Multi-GPU Cluster

Cited by 13 publications (7 citation statements)
References 1 publication
“…In [25], a distributed deep learning framework for a heterogeneous multi-GPU cluster combines the advantages of All-reduce and parameter-server methods. In addition, the proposed design performs significant mini-batch training asynchronously to increase the overall utilization of available computing power in the cluster.…”
Section: Related Work
Confidence: 99%
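The excerpt above describes a hybrid design: all-reduce gradient averaging among workers, combined with parameter-server-style updates applied asynchronously so that faster groups in a heterogeneous cluster need not wait for slower ones. A minimal illustrative sketch of that idea (not the paper's actual implementation; all names here are hypothetical) is:

```python
# Illustrative sketch only: combining all-reduce within a worker group
# with asynchronous parameter-server updates across groups.

def all_reduce(grads):
    """Average per-worker gradient vectors within one group."""
    n = len(grads)
    return [sum(g[i] for g in grads) / n for i in range(len(grads[0]))]

class ParameterServer:
    """Holds global weights; groups push averaged gradients asynchronously."""
    def __init__(self, weights, lr=0.1):
        self.weights = list(weights)
        self.lr = lr

    def push(self, avg_grad):
        # Applied as soon as a group finishes its mini-batch,
        # without synchronizing with slower (heterogeneous) groups.
        self.weights = [w - self.lr * g for w, g in zip(self.weights, avg_grad)]

    def pull(self):
        return list(self.weights)

ps = ParameterServer([1.0, 2.0])
fast_group = [[0.2, 0.4], [0.4, 0.2]]   # per-worker gradients in the fast group
ps.push(all_reduce(fast_group))          # fast group updates first
slow_group = [[0.1, 0.1]]                # slower group arrives later
ps.push(all_reduce(slow_group))
print(ps.pull())
```

In a real cluster the `push` calls would arrive from concurrent processes; here they are sequential only to keep the sketch runnable.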
“…The distributed deep learning methods can be two-fold: data parallelization and model parallelization. In the case of data parallelization, the deep learning model is replicated across multiple workers [27,28]. Each worker trains a deep learning model with different input data.…”
Section: Distributed Deep Learning
Confidence: 99%
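The data-parallel scheme described above — the model replicated on every worker, each worker training on a different input shard, with gradients averaged before the update — can be sketched as follows (a toy single-parameter example, not code from the cited works):

```python
# Minimal data-parallelism sketch: identical model replicas, distinct
# data shards, gradients averaged (an all-reduce) before each update.

def grad_linear(w, xs, ys):
    """Gradient of mean squared error for y ~ w * x on one data shard."""
    n = len(xs)
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n

w = 0.0                                   # parameter, replicated on every worker
shards = [([1.0, 2.0], [2.0, 4.0]),      # worker 0's data shard
          ([3.0], [6.0])]                # worker 1's data shard

local_grads = [grad_linear(w, xs, ys) for xs, ys in shards]
avg_grad = sum(local_grads) / len(local_grads)   # all-reduce average
w -= 0.05 * avg_grad                     # identical update on every replica
```

Because every replica applies the same averaged gradient, the copies stay bit-identical after each step, which is what distinguishes data parallelism from model parallelism (where the model itself is partitioned across workers).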
“…To achieve high accuracy, it is necessary to use large deep learning models and large datasets are needed to improve the generalization capabilities of the models. Training large-scale models with massive datasets is difficult due to the limited GPU memory size [1][2][3][4][5][6]. Distributed deep learning using multi-GPU/node can efficiently train large-scale models.…”
Section: Introduction
Confidence: 99%