2008
DOI: 10.1007/978-3-540-70575-8_32

Succinct Data Structures for Retrieval and Approximate Membership (Extended Abstract)

Abstract: The retrieval problem is the problem of associating data with keys in a set. Formally, the data structure must store a function f : U → {0, 1}^r that has specified values on the elements of a given set S ⊆ U, |S| = n, but may have any value on elements outside S. Minimal perfect hashing makes it possible to avoid storing the set S, but this induces a space overhead of Θ(n) bits in addition to the nr bits needed for function values. In this paper we show how to eliminate this overhead. Moreover, we show that f…

Citations: cited by 65 publications (89 citation statements)
References: 32 publications
“…The storage space of the resulting PHFs and MPHFs is within a factor of 1.43 of the information-theoretic lower bound. The closest competitor is the algorithm by Martin and Pagh [7], but their algorithm does not work in linear time. Furthermore, the CHD algorithm can be tuned to run faster than the BPZ algorithm [2] (the fastest algorithm available in the literature so far) and to obtain more compact functions.…”
Section: Discussion (mentioning)
confidence: 99%
“…Upper bounds on the threshold can be found by again viewing the problem as an orientation problem on random hypergraphs, and while some additional considerations are needed, an upper bound can be calculated [5]. Lower bounds have been achieved using a new approach for designing dictionary and retrieval structures, based on matrix techniques [13]. (See also [33].)…”
Section: Threshold Loads For Cuckoo Hashing (mentioning)
confidence: 99%
“…Storing the vector is then sufficient to generate the value associated with each key, and further requires just d lookups into the vector. As a specific example, for the important case of d = 3, there is an upper bound of 0.9183 for the threshold load [5], and a lower bound of 0.8894 [13]. Again, however, the question of bounds for efficient algorithms in the online setting remains more open.…”
Section: Threshold Loads For Cuckoo Hashing (mentioning)
confidence: 99%
“…An obvious possibility is to store a minimal perfect hash function on S and use the resulting value to index a table of rn bits. Much better theoretical solutions were made available recently [6,9,29]: essentially, it is possible to evaluate a function in constant time storing just rn + o(n) bits. Since we are interested in practical applications, however, we will use an extension of a technique developed by Majewski, Wormald, Havas and Czech [26] that has a slightly larger space usage, but has the advantage of being extremely fast, as it requires just the evaluation of three hash functions plus three accesses to memory.…”
Section: Storing Functions (mentioning)
confidence: 99%
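
The three-hash scheme mentioned in the statement above can be illustrated with a minimal Python sketch. It is not the code of any cited paper. It assumes string keys, r-bit integer values, a table of roughly 1.23n cells split into three equal parts, and BLAKE2b as a stand-in hash; the names `build`, `query` and `_positions` are hypothetical. Each key is mapped to three cells, the cells are filled by peeling the resulting hypergraph, and a lookup XORs the three cells. The key set S itself is never stored, so a query for a key outside S returns an arbitrary value, as the retrieval problem permits.

```python
import hashlib


def _positions(key, seed, m):
    """Map a string key to three cells, one in each third of an m-cell table."""
    third = m // 3
    digest = hashlib.blake2b(
        key.encode(), digest_size=24, salt=seed.to_bytes(8, "little")
    ).digest()
    return [
        i * third + int.from_bytes(digest[8 * i: 8 * i + 8], "little") % third
        for i in range(3)
    ]


def build(mapping, max_tries=100):
    """Return (seed, table) such that query() reproduces mapping on its keys."""
    keys = list(mapping)
    n = len(keys)
    third = (123 * n) // 300 + 2      # about 1.23 * n cells overall (assumed ratio)
    m = 3 * third
    for seed in range(max_tries):
        edges = [_positions(k, seed, m) for k in keys]
        incident = [[] for _ in range(m)]
        for e, cells in enumerate(edges):
            for p in cells:
                incident[p].append(e)
        degree = [len(lst) for lst in incident]
        removed = [False] * n
        order = []                    # (key index, cell that only this key touches)
        stack = [p for p in range(m) if degree[p] == 1]
        while stack:                  # peel keys that own a cell of degree 1
            p = stack.pop()
            if degree[p] != 1:
                continue
            e = next(x for x in incident[p] if not removed[x])
            removed[e] = True
            order.append((e, p))
            for q in edges[e]:
                degree[q] -= 1
                if degree[q] == 1:
                    stack.append(q)
        if len(order) == n:           # every key was peeled: assignment is possible
            break
    else:
        raise RuntimeError("peeling failed for all seeds tried")
    table = [0] * m
    for e, free in reversed(order):   # fill cells in reverse peeling order
        a, b, c = edges[e]
        # table[free] is still 0 here, so the XOR of all three equals the other two
        table[free] = mapping[keys[e]] ^ table[a] ^ table[b] ^ table[c]
    return seed, table


def query(seed, table, key):
    """Three hash evaluations plus three memory accesses."""
    a, b, c = _positions(key, seed, len(table))
    return table[a] ^ table[b] ^ table[c]
```

A possible use: `seed, table = build({"apple": 3, "pear": 5})`, then `query(seed, table, "apple")` returns 3. In this sketch the table holds about 1.23n cells of r bits each, which is one way to read the "slightly larger space usage" compared with the rn + o(n) bits of the theoretical solutions.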