Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2013
DOI: 10.1145/2487575.2488221

Dynamic memory allocation policies for postings in real-time Twitter search

Abstract: We explore a real-time Twitter search application where tweets are arriving at a rate of several thousand per second. Real-time search demands that they be indexed and searchable immediately, which leads to a number of implementation challenges. In this paper, we focus on one aspect: dynamic postings allocation policies for index structures that are completely held in main memory. The core issue can be characterized as a "Goldilocks Problem". Because memory remains today a scarce resource, an allocation policy…

Cited by 7 publications (3 citation statements)
References 15 publications (23 reference statements)
“…Note that Earlybird's design can be extended to an arbitrary number of pools and pool sizes; we can create more memory pools or use different slice sizes; this is further explored in Asadi et al. [2013]. A particular instantiation of this general strategy can be described by P = {P_1, P_2, …”
Section: Postings Allocation For In-memory Incremental Indexing (mentioning)
confidence: 99%
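
The pool-and-slice strategy mentioned in the statement above can be illustrated with a small sketch. This is not Earlybird's code or the policy from Asadi et al. [2013]; the class names, the slice sizes in the usage lines, and the rule "move to the next larger slice size when the current slice fills up" are assumptions chosen only to show how the choice of slice sizes trades reserved memory against the number of allocations.

```python
from dataclasses import dataclass, field


@dataclass
class PostingsList:
    """Hypothetical per-term postings list made of fixed-capacity slices."""
    slices: list = field(default_factory=list)
    level: int = -1  # index into the allocator's slice sizes; -1 means no slice yet


class SlicePoolAllocator:
    """Illustrative allocator: one pool per slice size, smallest first."""

    def __init__(self, slice_sizes):
        self.slice_sizes = list(slice_sizes)
        self.lists = {}
        self.allocated = 0  # posting slots reserved across all slices
        self.used = 0       # posting slots actually filled

    def add_posting(self, term, doc_id):
        plist = self.lists.setdefault(term, PostingsList())
        current_full = (
            not plist.slices
            or len(plist.slices[-1]) == self.slice_sizes[plist.level]
        )
        if current_full:
            # Assumed policy: take the next slice from the next larger pool,
            # staying in the largest pool once it has been reached.
            plist.level = min(plist.level + 1, len(self.slice_sizes) - 1)
            plist.slices.append([])
            self.allocated += self.slice_sizes[plist.level]
        plist.slices[-1].append(doc_id)
        self.used += 1

    def postings(self, term):
        plist = self.lists.get(term)
        if plist is None:
            return []
        return [doc_id for s in plist.slices for doc_id in s]


# Usage with made-up slice sizes: a rare term wastes little memory,
# while a frequent term quickly graduates to larger slices.
alloc = SlicePoolAllocator([2, 4, 8])
for doc_id in range(7):
    alloc.add_posting("frequent", doc_id)
alloc.add_posting("rare", 99)
print(alloc.allocated, alloc.used)
```

Small initial slices keep the cost of rare terms low, while larger later slices reduce the number of allocations for frequent terms; the gap between `allocated` and `used` is one simple way to look at the "Goldilocks" trade-off named in the abstract.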
“…In real-time search, there is another important advantage: users most often care only about the latest results. With an inverted index, it is desirable to traverse postings lists "backwards" (from most recent) and early exit when enough results have been accumulated [2]. Most systems are not designed this way, which foregoes optimization opportunities; adapting traditional query evaluation algorithms to operate in this manner is non-trivial.…”
Section: Future Work and Conclusion (mentioning)
confidence: 99%
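
The backwards traversal with early exit described in the statement above can be sketched as follows. This is a minimal illustration, not the cited system's query evaluator: the function name is hypothetical, and it assumes that each postings list is stored in ascending document-id order and that larger ids correspond to more recent tweets. It answers a conjunctive (AND) query and steps one posting at a time rather than using skip pointers.

```python
def recent_matches(postings_lists, k):
    """Return up to k doc ids that appear in every list, newest first.

    Assumes each list is sorted ascending and larger ids are more recent.
    """
    if not postings_lists or k <= 0:
        return []
    # One cursor per list, each starting at the newest (last) posting.
    cursors = [len(p) - 1 for p in postings_lists]
    results = []
    # Early exit: stop once k results are found or any list is exhausted.
    while len(results) < k and all(c >= 0 for c in cursors):
        current = [p[c] for p, c in zip(postings_lists, cursors)]
        target = min(current)
        if all(doc_id == target for doc_id in current):
            results.append(target)
            cursors = [c - 1 for c in cursors]
        else:
            # Step back only the cursors that point at ids newer than target.
            cursors = [
                c - 1 if doc_id > target else c
                for c, doc_id in zip(cursors, current)
            ]
    return results


# Usage: two terms, asking for the two most recent documents containing both.
print(recent_matches([[1, 3, 5, 7, 9], [3, 4, 7, 9, 10]], 2))  # prints [9, 7]
```

Because the loop ends as soon as `k` results have been collected, older postings are never touched when recent matches are plentiful.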
“…Despite alternative approaches based on approximate nearest-neighbor search [2,9], inverted indexes, in combination with modern query evaluation algorithms such as block-max Wand [5], remain the standard by which other retrieval techniques are judged. In this paper, we focus on inverted indexing applied to static document collections and explicitly leave aside the so-called real-time indexing problem, where high velocity document streams need to be ingested and made immediately searchable; such a scenario calls for different techniques than when working with static collections [3,1,12].…”
Section: Introduction (mentioning)
confidence: 99%