Cardinality estimation and dynamic length adaptation for Bloom filters

Papapetrou, Odysseas; Siberski, Wolf; Nejdl, Wolfgang

doi:10.1007/s10619-010-7067-2

Cited by 49 publications

(51 citation statements)

References 45 publications

“…The resulting Bloom filter has no false negative, which means the query result of any element y ∈ S₁ ∩ S₂ against BF_{S₁∩S₂} is always true. The false positive probability of the resulting Bloom filter is no higher than either of the constituent Bloom filter [38]. Note that due to collisions, it is possible that the jth bit is set in BF_{S₁} by an element in S₁ − S₁ ∩ S₂ and jth bit is set in BF_{S₂} by an element in S₂ − S₁ ∩ S₂.…”

Section: Bloom Filters · mentioning

confidence: 99%

When private set intersection meets big data

Dong

Chen

Wen

2013

Proceedings of the 2013 ACM SIGSAC Conference on Computer & Communications Security - CCS '13

300

201

Large scale data processing brings new challenges to the design of privacy-preserving protocols: how to meet the increasing requirements of speed and throughput of modern applications, and how to scale up smoothly when the data being protected is big. Efficiency and scalability become critical criteria for privacy preserving protocols in the age of Big Data. In this paper, we present a new Private Set Intersection (PSI) protocol that is extremely efficient and highly scalable compared with existing protocols. The protocol is based on a novel approach that we call oblivious Bloom intersection. It has linear complexity and relies mostly on efficient symmetric key operations. It has high scalability due to the fact that most operations can be parallelized easily. The protocol has two versions: a basic protocol and an enhanced protocol; the security of the two variants is analyzed and proved in the semi-honest model and the malicious model respectively. A prototype of the basic protocol has been built. We report the result of performance evaluation and compare it against the two previously fastest PSI protocols. Our protocol is orders of magnitude faster than these two protocols. To compute the intersection of two million-element sets, our protocol needs only 41 seconds (80-bit security) and 339 seconds (256-bit security) on moderate hardware in parallel mode.
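The intersection property described in the citation statement above can be illustrated with a small sketch. It assumes the intersection filter is formed by a bitwise AND of two filters that share the same length m and the same k hash functions. The `BloomFilter` class, the SHA-256-derived bit positions, and the sample sets are illustrative assumptions, not taken from the cited papers.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; a Python int serves as the bit array (assumption)."""

    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = 0

    def _positions(self, item):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def __contains__(self, item):
        return all((self.bits >> p) & 1 for p in self._positions(item))

    def intersect(self, other):
        # Bitwise AND is only meaningful for identical m and hash functions.
        if (self.m, self.k) != (other.m, other.k):
            raise ValueError("filters must share m and k")
        out = BloomFilter(self.m, self.k)
        out.bits = self.bits & other.bits
        return out


s1 = {"a", "b", "c", "d"}
s2 = {"c", "d", "e", "f"}
bf1, bf2 = BloomFilter(256, 3), BloomFilter(256, 3)
for x in s1:
    bf1.add(x)
for x in s2:
    bf2.add(x)

both = bf1.intersect(bf2)
# Every element of S1 ∩ S2 set its bits in both filters, so none is lost.
print(all(x in both for x in s1 & s2))
```

Because the AND result only keeps bits that are set in both inputs, its set bits are a subset of each constituent filter's set bits. A bit can still survive the AND without belonging to any common element, which is the collision case the statement mentions.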

“…These bit arrays are similar to those employed in traditional Bloom filters and are supported by a sufficiently large body of research work [14], [16], [17] that allows us to estimate the number of documents reachable for a multi-concept query solely based on these bit arrays. Similar to level 1, level 2 (TSBF_{2,P}) also contains multiple bit arrays, each representing different multi-concept queries whose concepts have C as the least common ancestor in the ontology hierarchy and for which P has at least one qualified document in its local document collection (TSBF_{2,P}(C)).…”

Section: Two-level Semantic Bloom Filter (TSBF) · mentioning

confidence: 99%

“…Research community has proposed many works to estimate the cardinality (i.e., number of elements) of an original set solely based on its Bloom filter bit array [14], [16], [17]. For our work we used the work presented by authors of [16].…”

Section: Estimating Set Intersection Based Cardinality From Bloom · mentioning

confidence: 99%

BSI: Bloom Filter-Based Semantic Indexing for Unstructured P2P Networks

Dissanayaka

Prasad

Navathe

et al. 2015

IJP2P
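The citation statements above refer to estimating a set's cardinality from the Bloom filter bit array alone. The specific method of reference [16] is not reproduced on this page, so the sketch below uses a commonly cited estimator as a stand-in: with m bits, k hash functions, and t set bits, n ≈ −(m/k)·ln(1 − t/m). The intersection size is then approximated by inclusion-exclusion over the bitwise-OR (union) filter. The function names, the SHA-256-derived positions, and the parameters are illustrative assumptions.

```python
import hashlib
import math


def positions(item, m, k):
    # k bit positions from salted SHA-256 (illustrative hash choice).
    for i in range(k):
        digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
        yield int.from_bytes(digest[:8], "big") % m


def build(items, m, k):
    bits = 0
    for item in items:
        for p in positions(item, m, k):
            bits |= 1 << p
    return bits


def estimate_cardinality(bits, m, k):
    t = bin(bits).count("1")  # number of set bits
    if t >= m:
        raise ValueError("filter is saturated; estimate undefined")
    return -(m / k) * math.log(1 - t / m)


m, k = 16384, 3
a = build(range(0, 300), m, k)    # 300 elements
b = build(range(200, 500), m, k)  # 300 elements, 100 shared with a

est_a = estimate_cardinality(a, m, k)
est_b = estimate_cardinality(b, m, k)
est_union = estimate_cardinality(a | b, m, k)  # OR of filters = union filter
est_inter = est_a + est_b - est_union          # inclusion-exclusion

print(round(est_a), round(est_b), round(est_union), round(est_inter))
```

The estimates are approximate and degrade as the filter fills up, which is why the sketch refuses to estimate from a saturated bit array.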

“…A single Bloom filter is used for all grams generated by a single string S. Papapetrou et al. [22] conclude that the optimal number of hash functions to do cardinality estimation using Bloom filters is 1. Based on this we fix k = 1 and only use a single hash function to build and query Bloom filters throughout the rest of the paper.…”

Section: String Matching Using Bloom Filters · mentioning

confidence: 99%

Approximate Two-Party Privacy-Preserving String Matching with Linear Complexity

Beck

Kerschbaum

2013

2013 IEEE International Congress on Big Data

Abstract. Consider two parties who want to compare their strings, e.g., genomes, but do not want to reveal them to each other. We present a system for privacy-preserving matching of strings, which differs from existing systems by providing a deterministic approximation instead of an exact distance. It is efficient (linear complexity), non-interactive, and does not involve a third party, which makes it particularly suitable for cloud computing. We extend our protocol such that it only reveals whether there is a match and not the exact distance. Further, an implementation of the system is evaluated and compared against current privacy-preserving string matching algorithms.
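The k = 1 setting mentioned in the citation statement above can be sketched as follows: all grams of one string go into a single Bloom filter built with one hash function, and the number of distinct grams is estimated from the count of set bits. The gram length q = 3, the filter length m = 4096, the SHA-256-based hash, and the estimator n ≈ −m·ln(1 − t/m) are assumptions chosen for illustration; the cited paper's actual construction may differ.

```python
import hashlib
import math


def grams(s, q=3):
    # Distinct substrings of length q (q = 3 is an illustrative choice).
    return {s[i:i + q] for i in range(len(s) - q + 1)}


def position(item, m):
    # The single hash function (k = 1), derived from SHA-256.
    digest = hashlib.sha256(item.encode()).digest()
    return int.from_bytes(digest[:8], "big") % m


def bloom_k1(items, m):
    bits = 0
    for item in items:
        bits |= 1 << position(item, m)
    return bits


def estimate_k1(bits, m):
    t = bin(bits).count("1")  # set bits
    if t >= m:
        raise ValueError("filter is saturated; estimate undefined")
    return -m * math.log(1 - t / m)


m = 4096
text = "the quick brown fox jumps over the lazy dog"
g = grams(text)
bf = bloom_k1(g, m)
est = estimate_k1(bf, m)

print(len(g), round(est, 1))
```

With one hash function each distinct gram sets at most one bit, so the set-bit count undercounts only when two grams collide, and the logarithmic estimator corrects for the expected number of such collisions.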
