Proceedings of the 14th International Conference on World Wide Web (WWW '05), 2005
DOI: 10.1145/1060745.1060785
Three-level caching for efficient query processing in large Web search engines

Abstract: Large web search engines have to answer thousands of queries per second with interactive response times. Due to the sizes of the data sets involved, often in the range of multiple terabytes, a single query may require the processing of hundreds of megabytes or more of index data. To keep up with this immense workload, large search engines employ clusters of hundreds or thousands of machines, and a number of techniques such as caching, index compression, and index and query pruning are used to improve scalability…

Cited by 120 publications (20 citation statements)
References 39 publications
“…In modern search engines, query processing represents one of the major performance bottlenecks, so caching can help to speed up the search engine performance as well as to reduce the latency perceived by the users. Caching can be applied at different granularities, including query results [28], posting lists of query terms [39], and posting list intersections [31]. Saraiva et al. [39] proposed a two-level architecture where the front-end machine caches the results of popular queries, while the back-end machines have a cache for the posting lists of the most frequently requested terms.…”
Section: Related Work
Mentioning confidence: 99%
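The two-level architecture summarized in this excerpt can be made concrete with a short sketch. The following is a minimal illustration, not the cited systems' actual code: the names (LRUCache, answer_query, fetch_postings, rank) are hypothetical, and production engines use far more elaborate, cost-aware policies than plain LRU.

```python
# A minimal sketch of two-level caching: a front-end cache of full query
# results and a back-end cache of posting lists. LRU stands in for the
# more elaborate policies used in practice; fetch_postings and rank are
# caller-supplied stand-ins for index access and scoring.
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, key):
        if key not in self.store:
            return None
        self.store.move_to_end(key)          # mark as most recently used
        return self.store[key]

    def put(self, key, value):
        self.store[key] = value
        self.store.move_to_end(key)
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)   # evict least recently used


result_cache = LRUCache(capacity=10_000)    # level 1: query -> top-k results
posting_cache = LRUCache(capacity=100_000)  # level 2: term  -> posting list


def answer_query(query, fetch_postings, rank):
    """Check the result cache first, then fall back to posting lists."""
    results = result_cache.get(query)
    if results is not None:
        return results                       # front-end hit: no index work
    postings = []
    for term in query.split():
        plist = posting_cache.get(term)
        if plist is None:
            plist = fetch_postings(term)     # disk/network access on a miss
            posting_cache.put(term, plist)
        postings.append(plist)
    results = rank(postings)                 # intersect and score (abstracted)
    result_cache.put(query, results)
    return results
```

A repeated identical query is answered entirely from the front-end cache, while a new query that shares terms with earlier ones still avoids fetching those terms' posting lists from disk.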
“…Altingovde, Ozcan, Cambazoglu, and Ulusoy (2011) coupled a result cache with a document id cache, which stores only document ids without snippets, further reducing the query traffic going to the backend search system. Long and Suel (2005) introduce, on top of result and posting list caching, a third level of cache where precomputed intersections of posting lists are stored. Li, Lee, Sivasubramaniam, and Giles (2007) propose a hybrid architecture involving result, posting list, and document caches.…”
Section: Related Work
Mentioning confidence: 99%
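A hedged sketch of the third level described in this excerpt: an intersection cache keyed by term pairs, sitting below the result cache and above the posting-list cache. The names (intersection_cache, intersect_sorted, postings_for) are illustrative assumptions, and the cost-aware admission and eviction policies of Long and Suel (2005) are omitted.

```python
# Illustrative third cache level: precomputed intersections of posting
# lists, keyed by an unordered term pair. Assumes posting lists are
# sorted lists of document ids; postings_for is a caller-supplied
# stand-in for the lower cache levels or the on-disk index.

intersection_cache = {}  # frozenset({t1, t2}) -> sorted doc-id list


def intersect_sorted(a, b):
    """Linear merge of two sorted doc-id lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def pair_intersection(t1, t2, postings_for):
    """Return the (possibly cached) intersection of two terms' postings."""
    key = frozenset((t1, t2))                # order-insensitive cache key
    if key not in intersection_cache:
        intersection_cache[key] = intersect_sorted(
            postings_for(t1), postings_for(t2)
        )
    return intersection_cache[key]
```

For queries with AND semantics, a node can start from a cached pair intersection and narrow it with the remaining terms' posting lists, saving most of the merge work for frequently co-occurring pairs.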
“…A further approach involves caching portions of a query (i.e., pairs of terms), as initially proposed in [14] and extended in [6]. This approach is named Intersection Caching and is implemented at the search-node level as well.…”
Section: Gabriel Tolosa, Luca Becchetti, Esteban Feuerstein, Alberto …
Mentioning confidence: 99%
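To show what "caching pairs of terms" means for longer queries, here is a small, self-contained sketch of how candidate cache keys could be enumerated; which pairs are actually materialized and how they are evicted is policy-dependent in the cited work and is not modeled here.

```python
# Hypothetical helper: every unordered term pair of a query is a
# candidate key for the intersection cache. Canonical (sorted) order
# makes "new york" and "york new" hit the same entry.
from itertools import combinations


def candidate_pair_keys(query_terms):
    return [tuple(pair) for pair in combinations(sorted(set(query_terms)), 2)]


# A three-term query yields three candidate pair keys:
print(candidate_pair_keys(["new", "york", "pizza"]))
# [('new', 'pizza'), ('new', 'york'), ('pizza', 'york')]
```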