2016
DOI: 10.1145/2996452

Exact and Approximate Maximum Inner Product Search with LEMP

Abstract: We study exact and approximate methods for maximum inner product search, a fundamental problem in a number of data mining and information retrieval tasks. We propose the LEMP framework, which supports both exact and approximate search with quality guarantees. At its heart, LEMP transforms a maximum inner product search problem over a large database of vectors into a number of smaller cosine similarity search problems. This transformation allows LEMP to prune large parts of the search space immediately and to s…

Cited by 28 publications (27 citation statements)
References 28 publications
“…Sequential scanning involves testing potential candidates sequentially, while weeding out unpromising ones. A prototypical approach is LEMP (Teflioudi & Gemulla, 2016; Teflioudi et al., 2015), which exploits the Cauchy-Schwarz inequality x_u^T y_i ≤ ‖x_u‖ · ‖y_i‖.…”
Section: Sequential Scanning
confidence: 99%
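The Cauchy-Schwarz bound above can be sketched as a norm-based candidate filter. This is an illustrative simplification, not LEMP's actual implementation; the function name and threshold parameter are assumptions:

```python
import numpy as np

def prune_by_norm(query, items, theta):
    """Drop item vectors that can never reach threshold theta.

    Illustrative sketch (not LEMP's actual code): by Cauchy-Schwarz,
    x^T y <= ||x|| * ||y||, so any item whose norm product with the
    query norm falls below theta cannot be a qualifying match.
    """
    qnorm = np.linalg.norm(query)
    item_norms = np.linalg.norm(items, axis=1)
    keep = qnorm * item_norms >= theta  # upper bound still reaches theta
    return items[keep]
```

Because the bound depends only on vector lengths, it can discard items with a single multiplication each, before any full inner product is computed.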
“…This avoids redundant inner product computations. Teflioudi & Gemulla (2016) and Teflioudi et al. (2015) further improve the efficiency of LEMP with an incremental pruning technique that refines the upper bounds by computing the partial inner product over the first several dimensions. FEXIPRO (Li et al., 2017) adopts the same framework and applies singular value decomposition to the user and item matrices to make the first dimensions more informative, effectively tightening the upper bounds and facilitating pruning.…”
Section: Sequential Scanning
confidence: 99%
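The incremental pruning idea can be sketched as follows: accumulate the inner product a few dimensions at a time, and bound the contribution of the yet-unseen suffix by Cauchy-Schwarz. This is a hypothetical sketch under assumed names; the block size and early-exit structure are simplifications of the papers' techniques:

```python
import numpy as np

def incremental_bound(query, item, theta, block=2):
    """Accumulate the inner product `block` dimensions at a time;
    after each step, bound the remaining suffix via Cauchy-Schwarz
    and stop early once the bound drops below theta."""
    query = np.asarray(query, dtype=float)
    item = np.asarray(item, dtype=float)
    d = len(query)
    partial = 0.0
    for start in range(0, d, block):
        end = min(start + block, d)
        partial += float(query[start:end] @ item[start:end])
        # upper bound on the contribution of dimensions end..d-1
        rest = np.linalg.norm(query[end:]) * np.linalg.norm(item[end:])
        if partial + rest < theta:
            return None          # pruned: cannot reach theta
    return partial               # exact inner product (survived all checks)
```

FEXIPRO's SVD step fits this scheme naturally: rotating both matrices so that most of the "energy" lies in the leading dimensions makes the suffix bound shrink faster, so candidates are rejected after fewer dimensions.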
“…Despite being one of the most central problems in similarity search and having numerous applications [48,43,15,64,65,68,17,16,18,60,69,71,14,50,11,70,34,33], until recently it was unclear whether there could be a near-linear-time 1.1-approximation algorithm, before the recent breakthrough by Abboud, Rubinstein and Williams [5]. In [5], a framework for proving inapproximability results for problems in P is established (the distributed PCP framework), from which it follows: Theorem 1.2 (Abboud, Rubinstein, Williams 2017).…”
Section: Motivation and Background: Hardness of Approximate Max-IP
confidence: 99%
“…LEMP. In SIGMOD 2015 [34] and TODS 2016 [33], Teflioudi et al. introduced the LEMP index, which empirically outperformed all prior approaches. LEMP solves the MIPS problem using a divide-and-conquer approach: first, it sorts the item vectors by length and partitions them into buckets, such that each bucket contains vectors of roughly equal magnitude.…”
Section: Existing MIPS Indexes
confidence: 99%
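The sort-and-bucket first step described above can be sketched in a few lines. This is an assumed simplification of LEMP's partitioning (real LEMP chooses bucket boundaries adaptively rather than splitting into equal-size chunks):

```python
import numpy as np

def bucket_by_length(items, num_buckets):
    """Sort item vectors by Euclidean length, descending, and split
    them into buckets of roughly equal-magnitude vectors.
    Equal-size splitting is an illustrative assumption."""
    norms = np.linalg.norm(items, axis=1)
    order = np.argsort(-norms)            # longest vectors first
    return np.array_split(items[order], num_buckets)
```

Because every vector in a bucket has a similar norm, a single per-bucket bound decides whether the whole bucket can contain a qualifying inner product, which is what lets LEMP skip entire buckets at query time.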
“…The fourth dataset, GloVe-Twitter, contains high-dimensional (up to f = 200) word embeddings generated from a corpus of Tweets, and has been previously used to benchmark approximate nearest-neighbor and MIPS algorithms. Per [33], we use the same permutation to select user vectors from the dataset, and use the remaining vectors as item vectors. To begin, the authors of LEMP [34] have made the models used in their evaluation publicly available; for the Netflix dataset, these models were trained using Distributed Stochastic Gradient Descent, as described in [35], and we denote these models by *-DSGD throughout this section.…”
Section: A. Experimental Setup
confidence: 99%