Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
DOI: 10.1145/2723372.2747647

LEMP: Fast Retrieval of Large Entries in a Matrix Product

Abstract: We study the problem of efficiently retrieving large entries in the product of two given matrices, which arises in a number of data mining and information retrieval tasks. We focus on the setting where the two input matrices are tall and skinny, i.e., with millions of rows and tens to hundreds of columns. In such settings, the product matrix is large and its complete computation is generally infeasible in practice. To address this problem, we propose the LEMP algorithm, which efficiently retrieves only the lar…
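To make the abstract's setting concrete, here is a minimal brute-force sketch in NumPy of the retrieval task LEMP targets. This is only the naive blocked baseline that the paper improves on, not LEMP itself; matrix sizes, the threshold, and the function name are illustrative assumptions.

```python
import numpy as np

def large_entries_bruteforce(Q, P, threshold, block_size=2048):
    """Return (i, j, value) triples with Q[i] . P[j] >= threshold.

    Naive blocked baseline for illustration only: it still computes the
    full m x n product block by block, which is exactly the cost LEMP
    tries to avoid. Q: (m, r), P: (n, r), with r small (tens to hundreds
    of columns, as in the "tall and skinny" setting).
    """
    results = []
    for start in range(0, Q.shape[0], block_size):
        block = Q[start:start + block_size] @ P.T      # one slice of the product
        rows, cols = np.nonzero(block >= threshold)    # large entries in this slice
        for i, j in zip(rows, cols):
            results.append((start + int(i), int(j), float(block[i, j])))
    return results

# toy example: tall-and-skinny factor matrices
rng = np.random.default_rng(0)
Q = rng.normal(size=(20_000, 50))
P = rng.normal(size=(5_000, 50))
hits = large_entries_bruteforce(Q, P, threshold=30.0)
```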

Cited by 47 publications (5 citation statements)
References 21 publications

“…For example, the use of factor models in recommender systems leads to matrix products B^T C with B, C ∈ R^{m×n}, m ≪ n, and n very large [29]. Another application is link prediction in graphs [26].…”
Section: The p Largest Entries of E
Mentioning (confidence: 99%)
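The quoted statement compresses a concrete identity worth spelling out: entry (i, j) of B^T C is the inner product of column i of B with column j of C, which is exactly the score a factor model assigns to a user-item pair. A small NumPy check, with sizes and variable names chosen for illustration rather than taken from the cited works:

```python
import numpy as np

# Factor-model scoring as a matrix product, in the quoted setting:
# B, C in R^{m x n} with m << n, so B.T @ C is the large n x n score matrix,
# and its (i, j) entry is the inner product of column i of B and column j of C.
m, n = 32, 2_000                      # illustrative sizes (m << n)
rng = np.random.default_rng(1)
B = rng.normal(size=(m, n))           # e.g. user factors, one column per user
C = rng.normal(size=(m, n))           # e.g. item factors, one column per item

i, j = 7, 42
score_direct = B[:, i] @ C[:, j]      # single inner product
score_matrix = (B.T @ C)[i, j]        # same entry of the full product
assert np.isclose(score_direct, score_matrix)
```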
“…An important feature of our algorithms is that they can be treated as black boxes that can be applied to many different problems, in contrast to the more specialized algorithms designed for products of two matrices, such as those in [3], [29]. Since our algorithms only require the computation of matrix-vector products, they are relatively simple to implement and can serve as a benchmark for testing more specialized algorithms in multiple application areas.…”
Mentioning (confidence: 99%)
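The "black box" access pattern referred to above can be pictured as an operator that exposes only matrix-vector products with E = B^T C and never forms E itself. A minimal sketch using SciPy's LinearOperator; this illustrates the assumed access pattern only, not the estimation algorithm of the citing paper:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator

# Black-box access to E = B.T @ C through matrix-vector products only.
# Any estimator that needs just E @ v and E.T @ w can work through this
# interface, at O(m*n) cost per product instead of O(n^2).
m, n = 32, 10_000
rng = np.random.default_rng(2)
B = rng.normal(size=(m, n))
C = rng.normal(size=(m, n))

E = LinearOperator(
    shape=(n, n),
    matvec=lambda v: B.T @ (C @ v),     # E @ v, E never materialized
    rmatvec=lambda w: C.T @ (B @ w),    # E.T @ w
)

v = rng.normal(size=n)
y = E.matvec(v)                         # product with E via the black box
```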
“…The inner product search (IPS) problem is important in many fields, e.g., information retrieval [18,32,35], recommender systems [6,7,30], data mining [19,27], databases [22,24,29], artificial intelligence [13,36], and machine learning [14,23,34]. These fields usually consider the 𝑘 maximum inner product search (𝑘-MIPS) problem, which, given a query vector and an output size 𝑘, returns the 𝑘 vectors having the maximum inner product with the query.…”
Section: Introduction
Mentioning (confidence: 99%)
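For reference, the k-MIPS problem described above has a one-screen exact baseline: compute all inner products with the query by a linear scan and select the top k. Function and variable names below are illustrative:

```python
import numpy as np

def k_mips(X, q, k):
    """Exact k maximum inner product search by linear scan (brute force).

    X: (n, d) database vectors, q: (d,) query. Returns the indices of the
    k vectors with the largest inner product <x, q>, best first.
    """
    scores = X @ q                                   # all n inner products
    top = np.argpartition(-scores, k - 1)[:k]        # unordered top-k in O(n)
    return top[np.argsort(-scores[top])]             # sort only the k winners

rng = np.random.default_rng(3)
X = rng.normal(size=(100_000, 64))
q = rng.normal(size=64)
print(k_mips(X, q, k=10))
```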
“…○ We iterate the above operation until we have |Q_k| = k. State-of-the-art exact IPS algorithms are based on linear scan [22,29]. Trivially, approximation algorithms which do not guarantee that all vectors such that x · q ≥ τ are included in the search result cannot solve the FI-IPS problem correctly.…”
Mentioning (confidence: 99%)
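The exactness requirement quoted above (every vector with x · q ≥ τ must appear in the result) is trivially satisfied by a full linear scan, which is also the access pattern behind the linear-scan-based exact algorithms the statement mentions. A hedged sketch with illustrative names:

```python
import numpy as np

def threshold_ips(X, q, tau):
    """Return indices of ALL vectors x in X with <x, q> >= tau.

    A full linear scan: it cannot miss any qualifying vector, so it meets
    the exactness requirement, but it touches every vector, which is the
    cost the cited work aims to reduce.
    """
    scores = X @ q
    return np.flatnonzero(scores >= tau)

rng = np.random.default_rng(4)
X = rng.normal(size=(50_000, 64))
q = rng.normal(size=64)
hits = threshold_ips(X, q, tau=25.0)
```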
“…This paper considers the top-k inner product join problem, which is defined as follows: Given two sets X and Y of high-dimensional vectors and a result size k, top-k inner product join between X and Y retrieves k pairs of vectors (x, y), where x ∈ X and y ∈ Y, with the largest inner product among X × Y. This problem has important applications, such as recommendation [1]-[3], information extraction [4], and finding outlier correlations [5]. More specifically, Figure 1 depicts a histogram of inner products of 1 million randomly sampled vector pairs in a Yahoo!…”
Section: Introduction
Mentioning (confidence: 99%)
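The top-k inner product join defined above also has an obvious but expensive exact baseline: enumerate all |X| × |Y| inner products and keep the k largest pairs. The sketch below (illustrative names and sizes) shows why specialized algorithms are needed, since this enumeration is quadratic in the input sizes:

```python
import numpy as np

def topk_inner_product_join(X, Y, k):
    """Exact top-k inner product join by full enumeration (baseline only).

    Returns the k (i, j) pairs with the largest <X[i], Y[j]> over X x Y,
    largest first. Materializes all |X|*|Y| products, so it is only
    feasible for small inputs.
    """
    S = X @ Y.T                                        # all pairwise inner products
    flat = np.argpartition(-S, k - 1, axis=None)[:k]   # top-k positions, unordered
    flat = flat[np.argsort(-S.ravel()[flat])]          # order the k winners
    rows, cols = np.unravel_index(flat, S.shape)
    return list(zip(rows.tolist(), cols.tolist()))

rng = np.random.default_rng(5)
X = rng.normal(size=(2_000, 32))
Y = rng.normal(size=(3_000, 32))
pairs = topk_inner_product_join(X, Y, k=5)
```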