Proceedings of the 2022 International Conference on Management of Data 2022
DOI: 10.1145/3514221.3517902

HET-GMP: A Graph-based System Approach to Scaling Large Embedding Model Training

Cited by 13 publications (3 citation statements)
References 31 publications
“…Angel-PTM offers a comprehensive solution for efficient deep learning model training in industrial settings. It leverages some key techniques [30,33,38] from Hetu [31], gets implemented over PyTorch [40], and features the Page abstraction for memory efficiency and a unified scheduling method for resource utilization. Furthermore, Angel-PTM has undergone extensive optimization on A100 servers, enabling it to take full advantage of hardware capabilities for deep learning tasks.…”
Section: Methods (mentioning)
confidence: 99%
“…At the same time, a single clicking log may contain only hundreds of non-zero entries. As a result, when we create the embedding for each feature, the whole embedding layer can be extremely large, and the parameters of the CTR prediction model are dominated (e.g., 99.9%) by the embedding part instead of the deep network part (Miao et al 2021; Ginart et al 2021). Table 1 shows the case under our experimental setting.…”
Section: Related Work (mentioning)
confidence: 99%
“…Datasets. We evaluate our algorithms on the following public datasets which are widely adopted by the community (Cheng et al 2016; Li et al 2019; Deng et al 2021; Wang et al 2021; Miao et al 2021). Criteo (Labs 2014) is a real-world CTR prediction dataset.…”
Section: Experiments, Experimental Setting (mentioning)
confidence: 99%