2016
DOI: 10.1145/2996452

Exact and Approximate Maximum Inner Product Search with LEMP

Abstract: We study exact and approximate methods for maximum inner product search, a fundamental problem in a number of data mining and information retrieval tasks. We propose the LEMP framework, which supports both exact and approximate search with quality guarantees. At its heart, LEMP transforms a maximum inner product search problem over a large database of vectors into a number of smaller cosine similarity search problems. This transformation allows LEMP to prune large parts of the search space immediately and to s…

Cited by 28 publications (27 citation statements)
References 28 publications
“…Sequential scanning involves testing potential candidates sequentially, while weeding out unpromising ones. A prototypical approach is LEMP (Teflioudi & Gemulla, 2016; Teflioudi et al., 2015), which exploits the Cauchy-Schwarz inequality x_u^T y_i ≤ ‖x_u‖ · ‖y_i‖.…”
Section: Sequential Scanning
confidence: 99%
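The Cauchy-Schwarz bound above can be sketched as a norm-based candidate filter. This is an illustrative simplification, not LEMP's actual implementation; the function name and threshold parameter are assumptions:

```python
import numpy as np

def prune_by_norm(query, items, theta):
    """Drop item vectors that can never reach threshold theta.

    Illustrative sketch (not LEMP's actual code): by Cauchy-Schwarz,
    x^T y <= ||x|| * ||y||, so any item whose norm product with the
    query norm falls below theta cannot be a qualifying match.
    """
    qnorm = np.linalg.norm(query)
    item_norms = np.linalg.norm(items, axis=1)
    keep = qnorm * item_norms >= theta  # upper bound still reaches theta
    return items[keep]
```

Because the bound depends only on vector lengths, it can discard items with a single multiplication each, before any full inner product is computed.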
“…This avoids redundant inner product computations. Teflioudi & Gemulla (2016) and Teflioudi et al. (2015) further improve the efficiency of LEMP with an incremental pruning technique that refines the upper bounds by computing the partial inner product over the first several dimensions. FEXIPRO (Li et al., 2017) adopts the same framework and applies singular value decomposition to the user and item matrices to make the first dimensions more informative, effectively tightening the upper bounds and facilitating pruning.…”
Section: Sequential Scanning
confidence: 99%
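The incremental pruning idea can be sketched as follows: accumulate the inner product a few dimensions at a time, and bound the contribution of the yet-unseen suffix by Cauchy-Schwarz. This is a hypothetical sketch under assumed names; the block size and early-exit structure are simplifications of the papers' techniques:

```python
import numpy as np

def incremental_bound(query, item, theta, block=2):
    """Accumulate the inner product `block` dimensions at a time;
    after each step, bound the remaining suffix via Cauchy-Schwarz
    and stop early once the bound drops below theta."""
    query = np.asarray(query, dtype=float)
    item = np.asarray(item, dtype=float)
    d = len(query)
    partial = 0.0
    for start in range(0, d, block):
        end = min(start + block, d)
        partial += float(query[start:end] @ item[start:end])
        # upper bound on the contribution of dimensions end..d-1
        rest = np.linalg.norm(query[end:]) * np.linalg.norm(item[end:])
        if partial + rest < theta:
            return None          # pruned: cannot reach theta
    return partial               # exact inner product (survived all checks)
```

FEXIPRO's SVD step fits this scheme naturally: rotating both matrices so that most of the "energy" lies in the leading dimensions makes the suffix bound shrink faster, so candidates are rejected after fewer dimensions.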
“…Despite being one of the most central problems in similarity search and having numerous applications [48,43,15,64,65,68,17,16,18,60,69,71,14,50,11,70,34,33], until recently it was unclear whether there could be a near-linear-time 1.1-approximation algorithm, before the recent breakthrough by Abboud, Rubinstein and Williams [5]. In [5], a framework for proving inapproximability results for problems in P is established (the distributed PCP framework), from which it follows: Theorem 1.2 (Abboud, Rubinstein, Williams 2017).…”
Section: Motivation and Background: Hardness of Approximate Max-IP
confidence: 99%
“…LEMP. In SIGMOD 2015 [34] and TODS 2016 [33], Teflioudi et al. introduced the LEMP index, which empirically outperformed all prior approaches. LEMP solves the MIPS problem using a divide-and-conquer approach: first, it sorts the item vectors by length and partitions them into buckets, such that each bucket contains vectors of roughly equal magnitude.…”
Section: Existing MIPS Indexes
confidence: 99%
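The sort-and-bucket first step described above can be sketched in a few lines. This is an assumed simplification of LEMP's partitioning (real LEMP chooses bucket boundaries adaptively rather than splitting into equal-size chunks):

```python
import numpy as np

def bucket_by_length(items, num_buckets):
    """Sort item vectors by Euclidean length, descending, and split
    them into buckets of roughly equal-magnitude vectors.
    Equal-size splitting is an illustrative assumption."""
    norms = np.linalg.norm(items, axis=1)
    order = np.argsort(-norms)            # longest vectors first
    return np.array_split(items[order], num_buckets)
```

Because every vector in a bucket has a similar norm, a single per-bucket bound decides whether the whole bucket can contain a qualifying inner product, which is what lets LEMP skip entire buckets at query time.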
“…The fourth dataset, GloVe-Twitter, contains high-dimensional (up to f = 200) word embeddings generated from a corpus of Tweets, and has been previously used to benchmark approximate nearest-neighbor and MIPS algorithms. Per [33], we use the same permutation to select user vectors from the dataset, and use the remaining vectors as item vectors. To begin, the authors of LEMP [34] have made the models used in their evaluation publicly available; for the Netflix dataset, these models were trained using Distributed Stochastic Gradient Descent, as described in [35], and we denote these models by *-DSGD throughout this section.…”
Section: A. Experimental Setup
confidence: 99%