2020 IEEE High Performance Extreme Computing Conference (HPEC)
DOI: 10.1109/HPEC43674.2020.9286180

Layer-Parallel Training with GPU Concurrency of Deep Residual Neural Networks via Nonlinear Multigrid

Cited by 11 publications (12 citation statements). References 18 publications.

“…In the pursuit of increased efficiency, several works have proposed approaches to parallelization across time or depth in neural networks. (Günther et al., 2020; Kirby et al., 2020; Sun et al., 2020) use multigrid and penalty methods to achieve speedups in ResNets. Meng et al. (2020) proposed a parareal variant of physics-informed neural networks (PINNs) for PDEs.…”
Section: Time-Parallelization in Neural Models
Confidence: 99%
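
The parallelization across depth that these works exploit rests on reading a residual block as one explicit-Euler step of an ODE, so that depth plays the role of time. Below is a minimal Python sketch of that correspondence; the tanh layer function, step size, and shapes are illustrative assumptions, not details from the cited papers.

    import numpy as np

    def residual_block(x, theta, h=0.1):
        # One residual layer read as a forward-Euler step of x' = f(x, theta);
        # tanh(theta @ x) is a stand-in for the layer function f.
        return x + h * np.tanh(theta @ x)

    def serial_forward(x0, thetas, h=0.1):
        # Conventional layer-serial propagation: this sequential dependence in
        # "time" (depth) is what multigrid/parareal methods relax.
        x = x0
        for theta in thetas:
            x = residual_block(x, theta, h)
        return x
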
“…However, as shown in (Gotmare et al., 2018), the performance of most of these methods is much worse than that of BP for deep convolutional neural networks. On the other hand, based on the similarity of ResNet training to the optimal control of nonlinear systems (E, 2017), the parareal method for solving differential equations has been employed to replace the conventional forward-backward propagation with iterative multigrid schemes (Günther et al., 2020; Parpas and Muir, 2019; Kirby et al., 2020). Although the locking issues can be resolved, the implementation is complicated and difficult to integrate with existing library technologies such as BP and automatic differentiation.…”
Section: Related Work
Confidence: 99%
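
To make the scheme this excerpt describes concrete, here is a hedged parareal-style sketch over depth: a cheap coarse propagator sweeps the layer chunks serially, while the accurate fine propagation of each chunk is independent and could run concurrently (e.g., one GPU per chunk). The coarse surrogate, the chunking, and the iteration count are illustrative assumptions, not the implementation of the cited works.

    import numpy as np

    def fine(x, thetas, h):
        # Accurate propagation through one chunk of residual layers; each
        # chunk's fine sweep is independent and can run on its own device.
        for theta in thetas:
            x = x + h * np.tanh(theta @ x)
        return x

    def coarse(x, thetas, h):
        # Cheap surrogate: one large Euler step using only the chunk's first
        # layer (a hypothetical coarse propagator, chosen for illustration).
        return x + h * len(thetas) * np.tanh(thetas[0] @ x)

    def parareal_forward(x0, chunks, h=0.1, iters=3):
        # chunks: list of per-chunk weight lists; U[k] approximates the state
        # entering chunk k. Initialize U with a serial coarse sweep.
        U = [x0]
        for c in chunks:
            U.append(coarse(U[-1], c, h))
        for _ in range(iters):
            F = [fine(U[k], chunks[k], h) for k in range(len(chunks))]    # parallel
            G = [coarse(U[k], chunks[k], h) for k in range(len(chunks))]  # parallel
            for k in range(len(chunks)):  # serial coarse correction
                U[k + 1] = coarse(U[k], chunks[k], h) + F[k] - G[k]
        return U[-1]

With enough iterations the corrected states reproduce the layer-serial forward pass, which is the sense in which such schemes trade redundant coarse work for concurrency across depth.
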
“…Although the locking issues can be resolved, the implementation is complicated and difficult to integrate with existing library technologies such as BP and automatic differentiation. Therefore, experiments were conducted on simple ResNets across small datasets (Kirby et al., 2020), rather than on state-of-the-art ResNets across larger datasets.…”
Section: Related Work
Confidence: 99%

“…Furthermore, Wu et al. [71] proposed a multilevel training scheme for video sequences. The multilevel methods were also explored in the context of layer-parallel training in References [34, 47]. Let us note, finally, that a variant of the multilevel line-search method was presented in Reference [23].…”
Confidence: 99%