Workload analysis and caching strategies for search advertising systems

Li, Conglong; Andersen, David G.; Fu, Qiang; Elnikety, Sameh; He, Yuxiong

doi:10.1145/3127479.3129255

Cited by 6 publications

(1 citation statement)

References 16 publications

“…This exhaustive search is a one-time cost amortized across all online/offline queries as long as there is no change to the database and training vectors. From experience at Microsoft Bing, there could be hundreds of millions of latency-sensitive online web search queries per day (that require ANN search) [38,39]. Thus the exhaustive search is a small cost compared to the total latency and computation reduction that the proposed approach can achieve over all queries.…”

Section: 13, mentioning

confidence: 99%

Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination

Zhang, Andersen, et al. 2020

Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data

Self Cite

In applications ranging from image search to recommendation systems, the problem of identifying a set of "similar" real-valued vectors to a query vector plays a critical role. However, retrieving these vectors and computing the corresponding similarity scores from a large database is computationally challenging. Approximate nearest neighbor (ANN) search relaxes the guarantee of exactness for efficiency by vector compression and/or by only searching a subset of database vectors for each query. Searching a larger subset increases both accuracy and latency. State-of-the-art ANN approaches use fixed configurations that apply the same termination condition (the size of subset to search) for all queries, which leads to undesirably high latency when trying to achieve the last few percents of accuracy. We find that due to the index structures and the vector distributions, the number of database vectors that must be searched to find the ground-truth nearest neighbor varies widely among queries. Critically, we further identify that the intermediate search result after a certain amount of search is an important runtime feature that indicates how much more search should be performed. To achieve a better tradeoff between latency and accuracy, we propose a novel approach that adaptively determines
