Proceedings of the Eighth ACM International Conference on Web Search and Data Mining 2015
DOI: 10.1145/2684822.2685289

Delayed-Dynamic-Selective (DDS) Prediction for Reducing Extreme Tail Latency in Web Search

Abstract: A commercial web search engine shards its index among many servers, and therefore the response time of a search query is dominated by the slowest server that processes the query. Prior approaches target improving responsiveness by reducing the tail latency of an individual search server. They predict query execution time, and if a query is predicted to be long-running, it runs in parallel, otherwise it runs sequentially. These approaches are, however, not accurate enough for reducing a high tail latency when r…
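The predict-then-parallelize policy that the abstract attributes to prior approaches can be sketched roughly as follows. This is an illustration only, not the DDS method itself (the abstract is truncated before DDS is described). The `Query` class, the `predicted_ms` field, the 50 ms threshold, and the thread count are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Query:
    text: str
    predicted_ms: float  # hypothetical output of an execution-time predictor


def choose_parallelism(query, threshold_ms=50.0, max_threads=4):
    """Return how many threads to use for a query.

    Queries predicted to run longer than threshold_ms are parallelized;
    all others run sequentially on one thread. Both parameters are
    assumed values for illustration.
    """
    return max_threads if query.predicted_ms > threshold_ms else 1


queries = [Query("weather", 5.0), Query("long rare phrase query", 180.0)]
plan = {q.text: choose_parallelism(q) for q in queries}
print(plan)
```

A single fixed threshold makes the weakness named in the abstract easy to see: any long-running query whose predicted time falls below the threshold runs sequentially and lands in the tail.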

Cited by 42 publications (47 citation statements)
References 24 publications

“…Dynamic pruning techniques such as WAND and BMW offer some relief as they offer the potential to safely skip the decompression of postings and the scoring of documents that cannot make the current top K. This makes the exact response time of a query difficult to predict, as not every posting in the postings lists will be decompressed and scored. Nevertheless recent work has considered making accurate predictions on the efficiency of a query, either in terms of absolute response time [29], or in terms of those queries with response times exceeding a threshold [19,21].…”
Section: Related Work (mentioning)
confidence: 99%
“…Efficiency predictions facilitate a number of applications for ensuring efficient yet effective retrieval - for instance, routing queries among busy replicated query shard servers [29]; selectively deploying multiple CPU cores for slow queries [19,21]; or adjusting the pruning aggressiveness or size of K for different queries [5,14,38]. Of these, the work of Tonellotto et al [38] is among the most similar to ours, in that they vary the number of documents to be retrieved, K, as well as the pruning aggressiveness, before passing to a learning-to-rank re-ranking phase, based on the predicted execution time of the query.…”
Section: Related Work (mentioning)
confidence: 99%
“…Moreover, reducing each server's tail latency is critical when a request spans several servers and responses are aggregated from these servers. In this case, the slower servers typically dominate the response time [22].…”
Section: Introduction (mentioning)
confidence: 99%
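
The observation that the slowest servers dominate an aggregated response can be illustrated with a small sketch. It assumes the aggregator waits for every shard and that server latencies are independent, which is a simplification. The function names and the sample numbers are hypothetical.

```python
def aggregate_latency(shard_latencies_ms):
    """Response time of a fan-out request that waits for every shard."""
    return max(shard_latencies_ms)


def prob_any_slow(num_servers, per_server_quantile=0.99):
    """Probability that at least one server exceeds its own latency
    quantile, assuming independent servers (an illustrative assumption)."""
    return 1.0 - per_server_quantile ** num_servers


# One slow shard sets the response time for the whole request.
print(aggregate_latency([10, 25, 300]))

# With many shards, hitting some server's 99th-percentile tail is common.
print(round(prob_any_slow(100), 3))
```

Under these assumptions, a request that fans out to 100 servers hits at least one server's 99th-percentile latency roughly 63% of the time, which is why per-server tail latency matters so much for the end-to-end response.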