Proceedings of the 16th ACM Conference on Recommender Systems 2022
DOI: 10.1145/3523227.3546765

A GPU-specialized Inference Parameter Server for Large-Scale Deep Recommendation Models

Abstract: Recommendation systems are of crucial importance for a variety of modern apps and web services, such as news feeds, social networks, e-commerce, search, etc. To achieve peak prediction accuracy, modern recommendation models combine deep learning with terabyte-scale embedding tables to obtain a fine-grained representation of the underlying data. Traditional inference serving architectures require deploying the whole model to standalone servers, which is infeasible at such massive scale. In this paper, we provide …

Cited by 7 publications
References 25 publications