Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval 2012
DOI: 10.1145/2348283.2348317

Efficient in-memory top-k document retrieval

Abstract: For over forty years the dominant data structure for ranked document retrieval has been the inverted index. Inverted indexes are effective for a variety of document retrieval tasks, and particularly efficient for large data collection scenarios that require disk access and storage. However, many efficiency-bound search tasks can now easily be supported entirely in-memory as a result of recent hardware advances. In this paper we present a hybrid algorithmic framework for in-memory bag-of-words ranked document ret…

Citations: cited by 27 publications (17 citation statements)
References: 42 publications (60 reference statements)
“…Our compact framework is based on encoding these pointers in smaller amount of bits, while the compressed framework further samples these pointers as they pass through some specially chosen nodes. These frameworks are fairly general and have also been shown to be practical [Patil et al 2011; Culpepper et al 2012; Belazzougui et al 2013]. Even though efficient solutions are already available for the central problem, there are still many interesting variations and open questions one could ask about.…”
Section: Results; citation type: mentioning
confidence: 99%
“…Navarro and Nekrich [2012b] gave an index of size O(n(log σ + log D)) bits index with optimal O(p + k) time; however, the hidden constants within the big-O notations are not small in practice [Konow and Navarro 2013]. It has been shown that, compact space indexes provide the best practical performance [Konow and Navarro 2013; Culpepper et al 2010] compared to linear space indexes [Patil et al 2011] (which are less efficient in terms of space occupancy) and the succinct space indexes [Culpepper et al 2012; (which are less efficient in terms of query processing time). See also Hsu and Ottaviano [2013] for a related result on top-k completion.…”
Section: Postscript; citation type: mentioning
confidence: 99%
“…Their index, on the other hand, turns out to be very competitive in practice. There exist several other indexes of practical interest [22,6,23,24,25].…”
Section: Source; citation type: mentioning
confidence: 99%
“…We compare their best performing variant, GREEDY, in this paper. Culpepper et al [5] adapted the scheme to large natural language text collections (where each word is taken as an atomic symbol), showing that it was competitive with inverted indexes for some queries (see previous work on this line by Patil et al [20]). The seminal work of Hon et al [13] also included succinct variants, which were implemented by Navarro and Valenzuela [19] on top of a compressed representation of D.…”
Section: Basic Concepts; citation type: mentioning
confidence: 99%