Proceedings of the 2019 ACM SIGPLAN International Symposium on Memory Management (ISMM 2019)
DOI: 10.1145/3315573.3329984
Automatic GPU memory management for large neural models in TensorFlow

Cited by 11 publications (5 citation statements) · References 4 publications

Citation statements (ordered by relevance):
“…This can prevent the model from escaping suboptimal regions in the loss landscape. Another issue worth considering is that a smaller batch size may impede the training process, primarily on hardware optimized for large batch sizes, and a large batch size will cause out-of-memory errors on systems that have low GPU memory [81,82]. By carefully considering these parameters, we developed an efficient and accurate model for spider identification.…”
Section: Discussion (mentioning)
confidence: 99%
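The batch-size trade-off this excerpt describes can often be worked around without extra GPU memory by accumulating gradients over several micro-batches before applying one optimizer step. Below is a minimal TensorFlow sketch of that idea; the model, loss, and `accum_steps` value are illustrative assumptions, not taken from the cited works.

```python
import tensorflow as tf

# Minimal sketch (illustrative model and hyperparameters): emulate a large
# effective batch on a memory-limited GPU by summing gradients over several
# small micro-batches, then applying a single optimizer step.
model = tf.keras.Sequential([tf.keras.layers.Dense(10, input_shape=(4,))])
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
accum_steps = 8  # effective batch = accum_steps * micro-batch size

def accumulated_step(micro_batches):
    # One accumulator per trainable variable, zero-initialized.
    accum = [tf.zeros_like(v) for v in model.trainable_variables]
    for x, y in micro_batches:  # e.g. 8 small (x, y) batches
        with tf.GradientTape() as tape:
            # Scale each loss so the summed gradient matches the mean
            # over the full effective batch.
            loss = loss_fn(y, model(x, training=True)) / accum_steps
        grads = tape.gradient(loss, model.trainable_variables)
        accum = [a + g for a, g in zip(accum, grads)]
    optimizer.apply_gradients(zip(accum, model.trainable_variables))
```

Only one micro-batch of activations is live at a time, so peak memory is set by the micro-batch size while the update statistics match the larger batch.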
“…It is limited to the interaction between the processor and memory, leaving aside accelerator devices such as the GPU. In order to solve or significantly reduce this problem, methods have emerged such as the optimization of the directed acyclic graph created at the time of the execution of the model, as proposed by Le [15] and Boemer [3], or methods of working with sparse matrices [20] to reduce memory consumption during training processes. Changing the way memory levels are used during training has also been considered, as in the works of Rhu [21] and Lim [17], but these also present bottlenecks due to the high level of communication that occurs through the PCIe bus.…”
Section: Discussion (mentioning)
confidence: 99%
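The host-device data movement this excerpt refers to can be imitated by hand with TensorFlow device scopes: park a large tensor in host RAM and copy it to the GPU only for the operation that needs it. Automatic schemes, such as the surveyed paper's graph rewriting, insert the equivalent transfer nodes into the dataflow graph instead. A minimal sketch (assumes a machine with at least one GPU; the tensor shape and reduction are arbitrary):

```python
import tensorflow as tf

# Hand-written version of the swap-out/swap-in idea: the tensor lives in
# host memory and crosses the PCIe bus only when a GPU op consumes it.
with tf.device('/CPU:0'):
    big = tf.random.normal([8192, 8192])  # parked in host memory

with tf.device('/GPU:0'):
    # Placing tf.identity on the GPU forces a host-to-device copy; the
    # GPU-resident copy is freed once it is no longer referenced.
    big_on_gpu = tf.identity(big)
    result = tf.reduce_sum(big_on_gpu)
```

As the excerpt notes, the PCIe transfer is the bottleneck: this trade of bandwidth for capacity only pays off when the copies overlap with, or are cheap relative to, the computation.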
“…All of the above have forced accelerator designers for deep neural networks to use high-cost memory solutions such as HBM (High Bandwidth Memory), used in Google TPUs [11]. Other solutions have been proposed to overcome these limitations, such as the development of new techniques to improve training by working directly on the neural network graph [3,15] or working with sparse matrices [20]. Designing specialized dense nodes for the effective use of accelerators has also been explored.…”
Section: Related Work (mentioning)
confidence: 99%
“…This results in O(√n) memory usage for a neural network model with n nodes. Another strategy is GPU memory management [43], where inactive tensors are automatically transferred from GPUs to the host and vice versa. This is transparent to users and the added performance penalty is often tolerable.…”
Section: Summary and Discussion (mentioning)
confidence: 99%
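The O(√n) figure in this excerpt comes from gradient checkpointing: keep activations only at segment boundaries, roughly every √n layers, and recompute the segment interiors during backpropagation. Below is a minimal TensorFlow sketch using tf.recompute_grad; the layer widths and segment split are illustrative assumptions, not the cited paper's implementation.

```python
import math
import tensorflow as tf

# Sketch of sqrt(n)-memory checkpointing: split n layers into ~sqrt(n)
# segments and let tf.recompute_grad rebuild each segment's activations
# during backprop instead of storing them.
n_layers = 16
seg = int(math.sqrt(n_layers))  # ~sqrt(n) layers per segment

layers = [tf.keras.layers.Dense(64, activation='relu')
          for _ in range(n_layers)]
for layer in layers:
    layer.build((None, 64))  # create weights up front, outside the wrapper

def make_segment(segment_layers):
    @tf.recompute_grad  # discard interior activations; recompute on backprop
    def segment(x):
        for layer in segment_layers:
            x = layer(x)
        return x
    return segment

segments = [make_segment(layers[i:i + seg])
            for i in range(0, n_layers, seg)]

x = tf.random.normal([32, 64])
with tf.GradientTape() as tape:
    h = x
    for s in segments:
        h = s(h)  # only segment-boundary activations stay live
    loss = tf.reduce_sum(h)
variables = [v for layer in layers for v in layer.trainable_variables]
grads = tape.gradient(loss, variables)
```

Storing √n boundaries plus recomputing one segment at a time gives the O(√n) activation footprint, at the cost of roughly one extra forward pass.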