2020
DOI: 10.1007/s11227-020-03200-6
Training deep neural networks: a static load balancing approach

Cited by 13 publications (8 citation statements)
References 12 publications
“…These techniques have been extrapolated to develop similar versions for distributed neural networks. In this context, computation awareness has been improved for data parallelism [4] and model parallelism [29]. As a consequence, partitioning the workload between resources from different nodes could produce communication bottlenecks that should be handled.…”
Section: Related Work
confidence: 99%
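As a rough illustration of the data-parallel pattern mentioned in the excerpt above, and of the gradient exchange that can become a communication bottleneck when workers sit on different nodes, here is a minimal sketch of a synchronous data-parallel update. The linear model, worker count, and helper names are illustrative assumptions, not code from the cited works.

```python
# Illustrative sketch of synchronous data parallelism (not the cited papers' code).
# Each worker computes gradients on its shard of the batch; the gradients are then
# averaged, which is the all-reduce-style exchange that can become a communication
# bottleneck across nodes.
import numpy as np

def local_gradient(weights, x_shard, y_shard):
    # Least-squares gradient for a linear model, used only to keep the example small.
    pred = x_shard @ weights
    return x_shard.T @ (pred - y_shard) / len(x_shard)

def data_parallel_step(weights, x_batch, y_batch, num_workers, lr=0.01):
    # 1. Partition the batch across workers (equal shards, i.e. no load balancing yet).
    x_shards = np.array_split(x_batch, num_workers)
    y_shards = np.array_split(y_batch, num_workers)
    # 2. Each worker computes a local gradient on its own replica of the model.
    grads = [local_gradient(weights, xs, ys) for xs, ys in zip(x_shards, y_shards)]
    # 3. Average the gradients (the communication step) and update the shared model.
    return weights - lr * np.mean(grads, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = np.zeros(8)
    x, y = rng.normal(size=(256, 8)), rng.normal(size=256)
    w = data_parallel_step(w, x, y, num_workers=4)
```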
“…However, in the case of GPU workers, batch processing time and batch size are not proportional, so the batch size is determined by numerical approximation using a static step size to adjust the batch size. A study that allocates a batch size proportional to the performance of each computing node using a static load-balancing technique has been presented [32]. BOA (Batch Orchestration Algorithm) adaptively adjusts the batch size according to the worker's speed to alleviate both static and dynamic stragglers [10].…”
Section: Straggler Mitigation
confidence: 99%
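To make the proportional-allocation idea concrete, the sketch below sizes each worker's batch in proportion to its measured throughput, in the spirit of a static load-balancing scheme; the throughput figures and function name are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of static load balancing for synchronous training:
# each worker receives a share of the global batch proportional to its
# measured throughput (samples/second), so all workers finish an iteration
# at roughly the same time. Throughput values are illustrative.
def proportional_batch_sizes(global_batch, throughputs):
    total = sum(throughputs)
    # Initial proportional split, rounded down to integers.
    sizes = [int(global_batch * t / total) for t in throughputs]
    # Hand out any leftover samples to the fastest workers first.
    leftover = global_batch - sum(sizes)
    for i in sorted(range(len(sizes)), key=lambda i: -throughputs[i])[:leftover]:
        sizes[i] += 1
    return sizes

if __name__ == "__main__":
    # e.g. one fast GPU node and two slower nodes profiled offline.
    print(proportional_batch_sizes(512, [900.0, 450.0, 300.0]))
    # -> [280, 139, 93]
```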
“…In data parallelism, nodes need to maintain the entire model parameters, and large convolutional networks cannot be loaded in edge devices. In addition, there is a synchronization waiting problem during model training [10]. Model parallelism [11,12] splits the tensor of a specific layer so that multiple computing nodes or processes work on that layer simultaneously, dividing a large network layer into multiple relatively small tensor-parallel computations. This approach does not require loading the entire model into edge nodes, which facilitates the training of larger models.…”
Section: Related Work
confidence: 99%
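As a simplified picture of the intra-layer (tensor) model parallelism the excerpt describes, the sketch below splits one dense layer's weight matrix column-wise across workers, so no single worker holds the whole layer; worker count, shapes, and function names are illustrative assumptions rather than the cited papers' implementations.

```python
# Simplified sketch of intra-layer (tensor) model parallelism:
# the weight matrix of a single dense layer is split column-wise across
# workers, each worker computes its slice of the output, and the slices
# are concatenated. No single worker needs to store the full layer.
import numpy as np

def split_layer(weight, num_workers):
    # Partition the layer's columns (output units) across workers.
    return np.array_split(weight, num_workers, axis=1)

def parallel_layer_forward(x, weight_shards):
    # Each worker applies its shard; the partial outputs are gathered
    # (a communication step) and concatenated into the full layer output.
    partial_outputs = [x @ w_shard for w_shard in weight_shards]
    return np.concatenate(partial_outputs, axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(32, 256))            # a batch of activations
    full_weight = rng.normal(size=(256, 1024))
    shards = split_layer(full_weight, num_workers=4)
    y = parallel_layer_forward(x, shards)
    assert np.allclose(y, x @ full_weight)    # matches the unsplit layer
```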