Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
DOI: 10.1145/3318464.3380600

Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination

Abstract: In applications ranging from image search to recommendation systems, the problem of identifying a set of "similar" real-valued vectors to a query vector plays a critical role. However, retrieving these vectors and computing the corresponding similarity scores from a large database is computationally challenging. Approximate nearest neighbor (ANN) search relaxes the guarantee of exactness for efficiency by vector compression and/or by only searching a subset of database vectors for each query. Searching a large…
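The trade-off the abstract describes can be made concrete. Below is a minimal sketch, assuming the Faiss library and randomly generated toy data (neither is named in the excerpt), of IVF-based ANN search in which one fixed parameter, nprobe, decides how much of the database is scanned for every query; this is the kind of one-size-fits-all termination condition that a learned, per-query criterion would replace.

```python
# Minimal IVF-based ANN search sketch (assumed: Faiss; random toy data).
import numpy as np
import faiss

d, nb, nq = 64, 100_000, 10                 # dimension, database size, #queries
rng = np.random.default_rng(0)
xb = rng.random((nb, d)).astype("float32")  # database vectors
xq = rng.random((nq, d)).astype("float32")  # query vectors

nlist = 1024                                # number of inverted lists (clusters)
quantizer = faiss.IndexFlatL2(d)            # coarse quantizer over centroids
index = faiss.IndexIVFFlat(quantizer, d, nlist)
index.train(xb)                             # learn cluster centroids
index.add(xb)

# nprobe fixes how many clusters are scanned for *every* query: too small and
# hard queries miss their true neighbors; too large and easy queries waste work.
index.nprobe = 16
distances, ids = index.search(xq, 10)       # top-10 approximate neighbors
```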

Cited by 29 publications (15 citation statements) · References 47 publications

Citation statements (ordered by relevance):
“…In parallel to our work, Li et al. [52] proposed a machine learning method, built on top of inverted-file (IVF [43] and IMI [5]) and k-NN graph (HNSW [57]) similarity search techniques, that addresses early termination of approximate NN queries while achieving a target recall. In contrast, our approach employs similarity search techniques based on data series indices [31] and, with a very small training set (up to 200 training queries in our experiments), provides per-query probabilistic guarantees along different dimensions: on the distance error, on whether the current answer is the exact one, and on the time needed to find the exact answer.…”
Section: Related Work
confidence: 99%
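The quoted passage contrasts this with the learned early-termination idea of Li et al. [52]. As a rough illustration of that idea only, the sketch below trains a gradient-boosted regressor to map query features to a sufficient search depth; the features, labels, and the slack/budget heuristic are simplified assumptions for illustration, not the authors' exact pipeline.

```python
# Toy sketch of learned adaptive early termination: a regressor predicts, per
# query, the search depth (e.g., the number of IVF clusters to probe) that was
# sufficient on similar training queries. Simplified assumption, not the exact
# method of Li et al. [52].
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def train_termination_model(train_queries, min_probes_needed):
    """train_queries: (n, d) query vectors; min_probes_needed: (n,) smallest
    nprobe that retrieved the exact nearest neighbor, computed offline
    against ground truth."""
    model = GradientBoostingRegressor(n_estimators=200, max_depth=4)
    model.fit(train_queries, min_probes_needed)
    return model

def adaptive_nprobe(model, query, max_probes=128, slack=1.2):
    # Over-predict slightly (slack) to hit the target recall with high
    # probability, and clamp to a hard latency budget (max_probes).
    pred = model.predict(query.reshape(1, -1))[0]
    return int(min(max(1.0, np.ceil(slack * pred)), max_probes))
```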
“…Furthermore, HNSW has been implemented on GPUs, accelerating the algorithm further [30]. Based on HNSW, Reference [20] proposes a training strategy that adaptively determines when to terminate the search, improving search speed. Although both our HSSG and HNSW algorithms use a hierarchical structure, there are still some differences between the two methods.…”
Section: Graph-based Method
confidence: 99%
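For context on the graph-based setting this quote discusses, the sketch below, assuming the hnswlib library and illustrative parameter values, shows HNSW's fixed search-time parameter ef, which bounds how many candidates are examined per query; this is the global termination knob that an adaptive strategy such as the one in Reference [20] sets per query instead.

```python
# Minimal HNSW search sketch (assumed: hnswlib; random toy data).
import numpy as np
import hnswlib

d, nb = 64, 100_000
rng = np.random.default_rng(0)
xb = rng.random((nb, d)).astype("float32")

index = hnswlib.Index(space="l2", dim=d)
index.init_index(max_elements=nb, ef_construction=200, M=16)
index.add_items(xb, np.arange(nb))

# ef bounds the candidate list during search: larger ef means more distance
# computations (higher recall, higher latency) for every query alike.
index.set_ef(64)
labels, distances = index.knn_query(xb[:10], k=10)
```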
“…For example, sparse retrieval methods (Section 3.6) often use the (weighted) inverted index to help find the top-𝐾 relevant documents efficiently, as in ad hoc search [Croft et al., 2010]. Dense retrieval methods (Section 3.7), on the other hand, have to resort to efficient similarity search methods [Aumüller et al., 2017; Johnson et al., 2017; Li et al., 2020] to find relevant documents in a continuous vector space.…”
Section: Document Retrieval
confidence: 99%
“…Searching a larger subset increases both accuracy and latency. We review some commonly used ANN methods, following closely the descriptions in Li et al. [2020] and Johnson et al. [2017].…”
Section: Approximate Nearest Neighbor Search
confidence: 99%