2022
DOI: 10.48550/arxiv.2204.06240
|View full text |Cite
Preprint
|
Sign up to set email alerts
|

CowClip: Reducing CTR Prediction Model Training Time from 12 hours to 10 minutes on 1 GPU

Abstract: The click-through rate (CTR) prediction task is to predict whether a user will click on the recommended item. As mind-boggling amounts of data are produced online daily, accelerating CTR prediction model training is critical to ensuring an up-to-date model and reducing the training cost. One approach to increase the training speed is to apply large batch training. However, as shown in computer vision and natural language processing tasks, training with a large batch easily suffers from the loss of accuracy. Ou… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1

Citation Types

0
2
0

Year Published

2024
2024
2024
2024

Publication Types

Select...
1

Relationship

0
1

Authors

Journals

citations
Cited by 1 publication
(2 citation statements)
references
References 20 publications
0
2
0
Order By: Relevance
“…A recent work, CowClip [76], proposes to clip the gradient of feature embedding according to the frequencies of features and successfully scales the batch size to reduce the training time. Cowclip first considers the impact of skewed feature distribution from the perspective of optimization.…”
Section: Skewed Feature Distributionmentioning
confidence: 99%
See 1 more Smart Citation
“…A recent work, CowClip [76], proposes to clip the gradient of feature embedding according to the frequencies of features and successfully scales the batch size to reduce the training time. Cowclip first considers the impact of skewed feature distribution from the perspective of optimization.…”
Section: Skewed Feature Distributionmentioning
confidence: 99%
“…A typical industrial CTR prediction model, as exemplified in studies such as [68,74,77], comprises two main types of parameters: sparse embedding and dense network. However, sparse embedding parameters often account for more than 99% of the total parameter count [76].…”
Section: Related Work 51 Click-through Rate Predictionmentioning
confidence: 99%