Proceedings of the 31st ACM International Conference on Information & Knowledge Management (CIKM 2022)
DOI: 10.1145/3511808.3557098

Approximate Nearest Neighbor Search under Neural Similarity Metric for Large-Scale Recommendation

Cited by 8 publications (5 citation statements). References 18 publications.
“…In some cases, we may need many more results, such as thousands. For example, in recommendation systems, a large number of candidates are first recalled and then filtered to get the final recommendations [17,67]. Fig.…”
Section: Large-scale Search Results (mentioning; confidence: 99%)
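The recall-then-filter pipeline this excerpt refers to can be illustrated with a minimal sketch. The embedding sizes, the candidate count, and the `score_model` re-ranker below are hypothetical placeholders, not the setup of the cited paper.

```python
import numpy as np

# Hypothetical toy setup: 100k item embeddings, 64-dim, one user query.
rng = np.random.default_rng(0)
item_emb = rng.standard_normal((100_000, 64)).astype("float32")
user_emb = rng.standard_normal(64).astype("float32")

# Stage 1: recall a large candidate set (thousands) with a cheap score,
# e.g. an inner product against the user embedding (or an ANN index).
recall_size = 2_000
coarse_scores = item_emb @ user_emb
candidates = np.argpartition(-coarse_scores, recall_size)[:recall_size]

# Stage 2: filter/re-rank the recalled candidates with a more expensive
# model (a placeholder here) down to the final recommendation list.
def score_model(user, items):
    # Stand-in for a learned ranking model.
    return items @ user

final_k = 50
fine_scores = score_model(user_emb, item_emb[candidates])
final_items = candidates[np.argsort(-fine_scores)[:final_k]]
```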
“…Moreover, work has been done to leverage current hardware capabilities, aside from algorithmic improvements. For example, when dealing with data that cannot be accommodated in memory, ANN methods such as DiskANN [22] or SPANN [23] propose to use data locality and fast disk storage (solid-state drives, SSDs). Multi-threading has also been exploited in ANN, with examples such as SCANN [10], or through threading parallelism at the level of query processing.…”
Section: Related Work (mentioning; confidence: 99%)
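As a rough illustration of the query-level threading parallelism mentioned in the excerpt, here is a minimal sketch. The `BruteForceIndex` class is a hypothetical stand-in for a real ANN index; the code does not reflect the actual internals of SCANN, DiskANN, or SPANN.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

class BruteForceIndex:
    """Stand-in for an ANN index exposing a search(query, k) method."""
    def __init__(self, vectors):
        self.vectors = vectors

    def search(self, query, k):
        scores = self.vectors @ query
        top = np.argpartition(-scores, k)[:k]
        return top[np.argsort(-scores[top])]

def parallel_search(index, queries, k, workers=8):
    # Query-level parallelism: each query is dispatched to a worker thread.
    # NumPy matrix products release the GIL, so threads can overlap work.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda q: index.search(q, k), queries))

rng = np.random.default_rng(0)
index = BruteForceIndex(rng.standard_normal((50_000, 32)).astype("float32"))
queries = rng.standard_normal((16, 32)).astype("float32")
results = parallel_search(index, queries, k=10)
```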
“…For example, (Jégou et al., 2011b) and (Baranchuk et al., 2018) add another refinement stage over the quantized embeddings and skip less promising clusters according to tailored heuristics. (Chen et al., 2021) create duplicated references for boundary embeddings to improve recall with high efficiency. The other research thread optimizes the VQ index towards retrieval quality with a cross-entropy loss instead of minimizing the reconstruction loss.…”
Section: Related Work (mentioning; confidence: 99%)
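The refinement-stage idea described in this excerpt, approximate scoring on compressed vectors followed by exact re-scoring of a shortlist, can be sketched as follows. The 8-bit scalar quantization and the shortlist size are simplifying assumptions, not the specific schemes of the cited works.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 20_000
base = rng.standard_normal((n, d)).astype("float32")
query = rng.standard_normal(d).astype("float32")

# Crude stand-in for quantized embeddings: 8-bit scalar quantization.
scale = np.abs(base).max()
quantized = np.round(base / scale * 127).astype("int8")

# Stage 1: approximate scoring on the compressed vectors.
approx_scores = (quantized.astype("float32") * scale / 127) @ query
shortlist = np.argpartition(-approx_scores, 200)[:200]

# Stage 2 (refinement): re-score the shortlist with the original
# full-precision embeddings and keep the final top-k.
exact_scores = base[shortlist] @ query
top10 = shortlist[np.argsort(-exact_scores)[:10]]
```

The refinement stage only touches the shortlist, so the extra exact computation is small compared with scanning the whole collection.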
“…By increasing the number of clusters to scan, one may expect higher retrieval quality since the relevant document is more likely to be included, yet with higher query latency since there are more documents to evaluate (Jégou et al, 2011a). On top of the basic idea, recent studies improve the accuracy of IVF by grouping the cluster embeddings and skipping the least promising groups (Baranchuk et al, 2018), creating duplicated records for boundary embeddings (Chen et al, 2021), and end-to-end learning the cluster assignments by knowledge distillation (Xiao et al, 2022a). Despite their improvements, IVF still exhibits limited retrieval quality, especially when high efficiency is needed.…”
Section: Introduction (mentioning; confidence: 99%)
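The recall/latency trade-off described in this excerpt is typically exposed in IVF implementations through the number of clusters probed per query. Below is a minimal sketch using Faiss's IVF index on random toy data; the `nlist` and `nprobe` values are illustrative only.

```python
import faiss
import numpy as np

d, nlist = 64, 256
rng = np.random.default_rng(0)
xb = rng.standard_normal((100_000, d)).astype("float32")
xq = rng.standard_normal((10, d)).astype("float32")

# IVF index: vectors are partitioned into nlist clusters at train time.
quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFFlat(quantizer, d, nlist, faiss.METRIC_L2)
index.train(xb)
index.add(xb)

# nprobe = number of clusters scanned per query: larger values make it
# more likely that the true neighbors' clusters are visited (better
# recall) but scan more vectors (higher latency).
for nprobe in (1, 8, 64):
    index.nprobe = nprobe
    distances, ids = index.search(xq, 10)
```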