2022
DOI: 10.36227/techrxiv.19522414
Preprint

A Survey on Auto-Parallelism of Neural Networks Training

Abstract: The data in this survey were collected from reference papers.

Cited by 2 publications (2 citation statements)
References 17 publications
“…Although it costs about 1/3 more arithmetic overheads, this method could save significant memory footprint and make it possible to train models with larger data microbatches and preserve more model parameters on each device. Activation recomputation is widely adopted [3,11,24], especially in PMP approaches, where workers might manage a large number of activations simultaneously. Merak employs and fine-tunes it to pursue higher memory efficiency.…”
Section: System
mentioning
confidence: 99%
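To make the trade-off in the statement above concrete, the following is a minimal sketch of activation recomputation using PyTorch's torch.utils.checkpoint. It illustrates the general technique only, not Merak's implementation; the CheckpointedBlock wrapper and the toy layer stack are assumptions for the example.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class CheckpointedBlock(nn.Module):
    """Illustrative wrapper: the wrapped block's intermediate activations are
    discarded in the forward pass and recomputed during the backward pass,
    trading extra forward computation for a smaller activation footprint."""
    def __init__(self, block: nn.Module):
        super().__init__()
        self.block = block

    def forward(self, x):
        # use_reentrant=False selects PyTorch's non-reentrant checkpointing mode.
        return checkpoint(self.block, x, use_reentrant=False)

# Toy stack of 12 blocks; only block-boundary activations stay in memory.
layers = nn.Sequential(
    *[CheckpointedBlock(nn.Sequential(nn.Linear(1024, 1024), nn.GELU()))
      for _ in range(12)]
)
x = torch.randn(4, 1024, requires_grad=True)
layers(x).sum().backward()  # dropped activations are recomputed here
```

Wrapping every layer is the simplest placement policy; the quoted statement suggests systems such as the cited PMP approaches tune where checkpoints go to balance the roughly 1/3 extra compute against memory savings.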
“…And we handle outputs of current subgraph and increase subgraph id if we create a new subgraph (lines 15-20). After traversing all nodes, we create subgraphs with corresponding nodes, inputs and outputs (lines 23-28). Inputs of each subgraph include all outputs of the last subgraph, thus the subgraph list could be executed as a sequence.…”
Section: Graph Sharding Algorithm
mentioning
confidence: 99%
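The sharding pass described in the statement above can be sketched as follows. The Node and Subgraph containers and the should_split predicate are illustrative assumptions, not the cited paper's data structures: nodes are visited in order, a new subgraph id is opened at each split point, the finished subgraph's outputs are recorded, and the next subgraph's inputs are seeded with them so the resulting list executes as a sequence.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Node:
    name: str
    outputs: List[str]          # tensor names this node produces

@dataclass
class Subgraph:
    sg_id: int
    nodes: List[Node] = field(default_factory=list)
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)

def shard_graph(nodes: List[Node],
                should_split: Callable[[Node], bool]) -> List[Subgraph]:
    subgraphs: List[Subgraph] = []
    current = Subgraph(sg_id=0)
    for node in nodes:
        if should_split(node) and current.nodes:
            # Close the current subgraph; as a simplification its outputs are
            # all tensors its nodes produce (a real pass would keep only
            # tensors consumed downstream).
            current.outputs = [t for n in current.nodes for t in n.outputs]
            subgraphs.append(current)
            # The next subgraph's inputs are the previous subgraph's outputs,
            # so the subgraph list can run as a sequence.
            current = Subgraph(sg_id=current.sg_id + 1,
                               inputs=list(current.outputs))
        current.nodes.append(node)
    # Finalize the last subgraph after traversing all nodes.
    current.outputs = [t for n in current.nodes for t in n.outputs]
    subgraphs.append(current)
    return subgraphs

# Usage: split before every even-indexed node, purely for illustration.
nodes = [Node(f"op{i}", [f"t{i}"]) for i in range(6)]
for sg in shard_graph(nodes, should_split=lambda n: int(n.name[2:]) % 2 == 0):
    print(sg.sg_id, [n.name for n in sg.nodes], sg.inputs, "->", sg.outputs)
```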