Proceedings of the 18th ACM Conference on Information and Knowledge Management 2009
DOI: 10.1145/1645953.1646012

Low-cost management of inverted files for online full-text search

Abstract: In dynamic environments with frequent content updates, we require online full-text search that scales to large data collections and achieves low search latency. Several recent methods that support fast incremental indexing of documents typically keep on disk multiple partial index structures that they continuously update as new documents are added. However, spreading indexing information across multiple locations on disk tends to considerably decrease the search responsiveness of the system. In the present pap…

Citations: cited by 14 publications (7 citation statements)
References: 19 publications (46 reference statements)
“…The accumulated postings are routinely combined with the rest of the data in a hierarchy based on geometric partitioning [8]. More advanced techniques exist [7,9], but they would not change the tradeoffs that we show. We use PForDelta [15] for compression, which has shown efficient decompression performance in recent studies [13].…”
Section: Methods (mentioning)
confidence: 93%
“…The variety of cipher terms caused by key change will result in index update. For index update, Margaritis and Anastasiadis (2009) only flush selectively the terms with most posting lists in memory into disk to merge it with primary index when the memory gets full with new posting lists. Gurajada and Kumar (2009) propose a new merge-based index maintenance strategy for information retrieval systems.…”
Section: Related Work (mentioning)
confidence: 99%
“…In the buffer-and-flush approach, Margaritis and Anastasiadis [12] present an interesting alternative beyond the three strategies discussed above. They make a slightly different design choice: when the in-memory buffer reaches capacity, instead of flushing the entire in-memory index, they choose to flush only a portion of the term space (a contiguous range of terms based on lexicographic sort order), performing a merge with the corresponding on-disk portions of the inverted lists.…”
Section: Related Work (mentioning)
confidence: 99%
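
The design choice described in the last citation statement, flushing only a contiguous lexicographic range of terms when the in-memory buffer fills, can be illustrated with a toy sketch. This is not the authors' implementation: the class name `RangeFlushIndex`, the fixed range boundaries, the use of plain dictionaries in place of on-disk blocks, and the policy of flushing the range with the most buffered postings are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


class RangeFlushIndex:
    """Toy inverted index; each 'disk' block is a dict covering one term range."""

    def __init__(self, boundaries, capacity):
        # boundaries: sorted first term of each range; the first entry must be ""
        # so that every term falls into some range (hypothetical layout).
        self.boundaries = boundaries
        self.capacity = capacity             # max postings buffered in memory
        self.memory = defaultdict(list)      # term -> buffered document ids
        self.buffered = 0                    # number of postings in memory
        self.disk = [dict() for _ in boundaries]

    def _range_of(self, term):
        # Index of the lexicographic range that contains the term.
        return bisect_right(self.boundaries, term) - 1

    def add(self, doc_id, terms):
        for term in set(terms):
            self.memory[term].append(doc_id)
            self.buffered += 1
        # Flush one range at a time until the buffer fits again.
        while self.buffered > self.capacity:
            self._flush_one_range()

    def _flush_one_range(self):
        # Assumed policy: pick the range holding the most buffered postings.
        load = defaultdict(int)
        for term, postings in self.memory.items():
            load[self._range_of(term)] += len(postings)
        victim = max(load, key=load.get)
        block = self.disk[victim]
        # Merge only that range's terms into the matching on-disk block.
        for term in [t for t in self.memory if self._range_of(t) == victim]:
            postings = self.memory.pop(term)
            block.setdefault(term, []).extend(postings)
            self.buffered -= len(postings)
        return victim

    def search(self, term):
        # A term lives in exactly one disk block, plus possibly the buffer.
        on_disk = self.disk[self._range_of(term)].get(term, [])
        return on_disk + self.memory.get(term, [])


idx = RangeFlushIndex(["", "h", "p"], capacity=4)
idx.add(1, ["apple", "banana", "zebra"])
idx.add(2, ["apple", "cherry", "kiwi"])   # buffer overflows, range "" to "h" is flushed
print(idx.search("apple"), sorted(idx.memory))
```

In this sketch a query for a term touches at most one block plus the buffer, which mirrors the motivation stated in the abstract: avoiding index information spread across many locations on disk.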