Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining 2020
DOI: 10.1145/3394486.3403059

Compositional Embeddings Using Complementary Partitions for Memory-Efficient Recommendation Systems

Abstract: Modern deep learning-based recommendation systems exploit hundreds to thousands of different categorical features, each with millions of different categories ranging from clicks to posts. To respect the natural diversity within the categorical data, embeddings map each category to a unique dense representation within an embedded space. Since each categorical feature could take on as many as tens of millions of different possible categories, the embedding tables form the primary memory bottleneck during both training and inference…
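The compression idea the abstract alludes to, composing each category's vector from smaller tables indexed by complementary partitions of the category set, can be illustrated with a short sketch. The following NumPy snippet is a minimal illustration of a quotient-remainder style composition, not the authors' implementation; the table sizes, embedding dimension, divisor, and the element-wise-product combiner are assumptions chosen for brevity.

```python
import numpy as np

# Sketch: replace one table of shape (num_categories, dim) with two small
# tables indexed by complementary partitions of the category set
# (quotient and remainder of the category index). Every category maps to a
# unique (quotient, remainder) pair, so composed vectors stay distinct while
# memory drops from num_categories * dim to (num_categories/m + m) * dim.

num_categories = 10_000_000   # hypothetical feature cardinality
dim = 16                      # hypothetical embedding dimension
m = 4_000                     # divisor defining the two partitions

rng = np.random.default_rng(0)
quotient_table = rng.normal(size=((num_categories + m - 1) // m, dim)).astype(np.float32)
remainder_table = rng.normal(size=(m, dim)).astype(np.float32)

def compositional_embedding(category: int) -> np.ndarray:
    """Compose a per-category vector from the two partition tables."""
    q, r = divmod(category, m)
    # Element-wise product is one possible combiner; concatenation or
    # summation are alternatives.
    return quotient_table[q] * remainder_table[r]

vec = compositional_embedding(1_234_567)
print(vec.shape)  # (16,), from 6,500 stored rows instead of 10,000,000
```

With these illustrative numbers the two tables hold roughly 6,500 rows in place of ten million, which is the kind of memory reduction the paper targets.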


Cited by 46 publications (26 citation statements)
References 21 publications
“…Our test set results of 0.4442 and 0.4454 for the lowest loss and most efficient embedding cardinality search architectures respectively are significantly in excess of the ∼0.447 (estimated from graphs by counting pixels) reported in figure 5 of [26] as their DLRM baseline. Our latter efficient result also uses ∼12× fewer parameters than the 5.4 × 10⁸ reported for that baseline.…”
Section: Comparisons To Prior Work
confidence: 60%
“…Our best result for embedding cardinality search compresses the total size of embedding tables 15.14× with a relative 0.0012 increase in loss, demonstrating the promise of our approach (see sections VI-C3 and VIII-C). Our approach discovered recommendation models that beat the state-of-the-art in terms of logloss with significantly fewer parameters (0.4442 vs. 0.447 of [26]; see VIII-D). Moreover, our approach discovered this model using 52× less computational effort (see section VIII-D).…”
Section: Introduction
confidence: 99%
“…They ensure that the most frequent ones will be assigned a unique embedding (zero collisions for those) and the rest will be hashed to shared embeddings using two hash functions (double hashing). In a recent work, Shi et al [21] create a unique embedding for each category by composing shared entries from multiple smaller embedding tables. Again in the recommendation domain, Kang et al [8] propose DHE that replaces one-hot encodings with dense vectors from multiple hash functions.…”
Section: Related Work
confidence: 99%
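The scheme described in the preceding quote, dedicated embeddings for the most frequent categories and shared, double-hashed embeddings for the tail, can be sketched as follows. This is an illustrative reconstruction rather than code from the cited papers; the hash construction, table sizes, and the averaging of the two hashed rows are assumptions.

```python
import hashlib
import numpy as np

# Sketch: frequent categories get their own collision-free rows; infrequent
# ("tail") categories fall back to a small shared table addressed by two
# hash functions (double hashing), so a collision on one hash rarely
# coincides with a collision on the other.

dim = 16
num_frequent = 100_000   # hypothetical: top categories kept collision-free
shared_rows = 10_000     # hypothetical shared table size for the tail

rng = np.random.default_rng(0)
frequent_table = rng.normal(size=(num_frequent, dim)).astype(np.float32)
shared_table = rng.normal(size=(shared_rows, dim)).astype(np.float32)

def _hash(category: int, salt: str) -> int:
    """Deterministic hash of a category id into the shared table."""
    digest = hashlib.sha256(f"{salt}:{category}".encode()).hexdigest()
    return int(digest, 16) % shared_rows

def lookup(category: int, rank: int) -> np.ndarray:
    """rank = frequency rank of the category (0 = most frequent)."""
    if rank < num_frequent:
        return frequent_table[rank]          # zero-collision path
    h1 = _hash(category, "a")
    h2 = _hash(category, "b")
    # Combining the two hashed rows by averaging is an arbitrary choice here.
    return 0.5 * (shared_table[h1] + shared_table[h2])
```

The compositional approach of Shi et al. quoted above differs in that it keeps every category's representation unique by construction, rather than accepting controlled collisions for the tail.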
“…To improve content personalization, recommendation models are growing rapidly in size and complexity [39,58,59,60]. Tackling the growing model sizes, researchers have proposed techniques to compress embedding tables while preserving accuracy [12,14,46,52]. Alternatively, one can decompose large monolithic models into multi-stage pipelines.…”
Section: Related Work
confidence: 99%