Tang, Haohua scite author profile

A recent series of papers by Andoni, Naor, Nikolov, Razenshteyn, and Waingarten (STOC 2018, FOCS 2018) has given approximate near neighbour search (NNS) data structures for a wide class of distance metrics, including all norms. In particular, these data structures achieve approximation on the order of p for ℓ_p^d norms, with space complexity nearly linear in the dataset size n and polynomial in the dimension d, and query time sub-linear in n and polynomial in d. The main shortcoming is that their construction requires pre-processing time exponential in d.

In this paper, we describe a more direct framework for constructing NNS data structures for general norms. More specifically, we show via an algorithmic reduction that an efficient NNS data structure for a metric M is implied by an efficient average distortion embedding of M into ℓ1 or Euclidean space. In particular, the resulting data structures require only polynomial pre-processing time, as long as the embedding can be computed in polynomial time.

As a concrete instantiation of this framework, we give an NNS data structure for ℓp with efficient pre-processing that matches the approximation factor, space and query complexity of the aforementioned data structure of Andoni et al. On the way, we resolve a question of Naor (Analysis and Geometry in Metric Spaces, 2014) and provide an explicit, efficiently computable embedding of ℓp, for p ≥ 2, into ℓ2 with (quadratic) average distortion on the order of p. Furthermore, we also give data structures for Schatten-p spaces with improved space and query complexity, albeit still requiring exponential pre-processing when p ≥ 2. We expect our approach to pave the way for constructing efficient NNS data structures for all norms.

Gaussian Noise is Nearly Instance Optimal for Private Unbiased Mean Estimation

Nikolov, Haohua (2023, preprint)

We investigate unbiased high-dimensional mean estimators in differential privacy. We consider differentially private mechanisms whose expected output equals the mean of the input dataset, for every dataset drawn from a fixed convex domain K in R^d. In the setting of concentrated differential privacy, we show that, for every input, such an unbiased mean estimator introduces approximately at least as much error as a mechanism that adds Gaussian noise with a carefully chosen covariance. This is true when the error is measured in the ℓp norm for any p ≥ 2. We extend this result to local differential privacy and to approximate differential privacy, but for the latter the error lower bound holds either for a dataset or for a neighboring dataset. We also extend our results to mechanisms that take i.i.d. samples from a distribution over K and are unbiased with respect to the mean of the distribution.

Tang, Haohua

Near Neighbor Search via Efficient Average Distortion Embeddings
