Consider the setting where multiple parties each hold a multiset of users and the task is to estimate the reach (i.e., the number of distinct users appearing across all parties) and the frequency histogram (i.e., fraction of users appearing a given number of times across all parties). In this work we introduce a new sketch for this task, based on an exponentially distributed counting Bloom filter. We combine this sketch with a communication-efficient multi-party protocol to solve the task in the multi-worker setting. Our protocol exhibits both differential privacy and security guarantees in the honest-but-curious model and in the presence of large subsets of colluding workers; furthermore, its reach and frequency histogram estimates have a provably small error. Finally, we show the practicality of the protocol by evaluating it on internet-scale audiences.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.