2006
DOI: 10.1145/1132956.1132959

Inverted files for text search engines

Abstract: The technology underlying text search engines has advanced dramatically in the past decade. The development of a family of new index representations has led to a wide range of innovations in index storage, index construction, and query evaluation. While some of these developments have been consolidated in textbooks, many specific techniques are not widely known or the textbook descriptions are out of date. In this tutorial, we introduce the key techniques in the area, describing both a core implementation and …

Citations: cited by 906 publications (702 citation statements)
References: 140 publications
“…The most popular are inverted indexes [5,10,68]. Inverted indexes are efficient because their search strategy is based on the vocabulary (the set of distinct words in the text), which is usually much smaller than the text, and thus fits in main memory.…”
Section: Indexed Text Searching (mentioning)
confidence: 99%
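
The idea in the statement above, searching through a small in-memory vocabulary rather than scanning the text, can be illustrated with a minimal sketch. This is not the implementation from the cited tutorial: the function names `build_inverted_index` and `search`, the whitespace tokenizer, and the three sample documents are all assumptions made for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each distinct word (the vocabulary) to a sorted list of document ids."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        # Assumption: lowercase whitespace tokenization; real engines do more.
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def search(index, query):
    """Return ids of documents that contain every word of the query."""
    postings = [set(index.get(word, ())) for word in query.lower().split()]
    if not postings:
        return []
    # A conjunctive query is the intersection of the words' posting lists.
    return sorted(set.intersection(*postings))


# Hypothetical sample collection, used only to exercise the sketch.
docs = [
    "inverted files for text search engines",
    "text indexing with inverted lists",
    "spatial search with tree indexes",
]
index = build_inverted_index(docs)
```

A query touches only the posting lists of its own words, so the cost depends on the vocabulary lookup and the list lengths, not on the total size of the text.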
“…In the STR-tree, a leaf node includes entries in the form (optr, loc, oti), where optr is a pointer to an object in D, oti represents the tag information of an object, which is indexed by inverted lists [12]. A intermediate node contains these entries in the form (Nptr, MBR, Ntsum), where Ntsum represents the tag summary information of its child nodes referred by Nptrs.…”
Section: Str-tree: a Refined Hybrid Indexing Mechanism (mentioning)
confidence: 99%
“…Finally, all of node sets in I_k are appended to list L (line 11). After all node sets L are found, we check each node set in L to see whether its MinRank score is less than uppC (lines 12-14). Those that cannot qualify the conditions are eliminated.…”
Section: Fig 3 the Construction Of Shadow Prefix-tree (mentioning)
confidence: 99%
“…As mentioned above, the multimap abstract data type is related to the inverted file and inverted index structures, which are well-known in text indexing applications (e.g., see Knuth [20]) and are also used in search engines (e.g., see Zobel and Moffat [31]). Cutting and Pedersen [12] describe an inverted file implementation that uses B-trees for the indexing structure and supports insertions, but doesn't support deletions efficiently.…”
Section: Introduction (mentioning)
confidence: 99%