We show an optimal data-dependent hashing scheme for the approximate near neighbor problem. For an n-point dataset in a d-dimensional space our data structure achieves query time $O(d \cdot n^{\rho+o(1)})$ and space $O(n^{1+\rho+o(1)} + d \cdot n)$, where $\rho = \frac{1}{2c^2-1}$ for the Euclidean space and approximation c > 1. For the Hamming space, we obtain an exponent of $\rho = \frac{1}{2c-1}$. Our result completes the direction set forth in [5], which gave a proof-of-concept that data-dependent hashing can outperform classic Locality Sensitive Hashing (LSH). In contrast to [5], the new bound is not only optimal, but in fact improves over the best (optimal) LSH data structures [15,3] for all approximation factors c > 1. From the technical perspective, we proceed by decomposing an arbitrary dataset into several subsets that are, in a certain sense, pseudo-random.
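To make the claimed improvement concrete, the exponents above can be evaluated numerically. The sketch below is only an illustration: the function names are hypothetical, the o(1) terms are ignored, and the classic LSH exponents 1/c^2 (Euclidean) and 1/c (Hamming) are not stated in the abstract but are assumed here as the standard values associated with the optimal LSH schemes it cites.

```python
# Illustrative comparison of query-time exponents (o(1) terms ignored).
# Function names are hypothetical; the classic LSH exponents are an
# assumption taken from the standard LSH literature, not from the abstract.

def rho_data_dependent_euclidean(c: float) -> float:
    """Exponent rho = 1 / (2c^2 - 1) stated in the abstract (Euclidean)."""
    return 1.0 / (2.0 * c * c - 1.0)

def rho_data_dependent_hamming(c: float) -> float:
    """Exponent rho = 1 / (2c - 1) stated in the abstract (Hamming)."""
    return 1.0 / (2.0 * c - 1.0)

def rho_classic_lsh_euclidean(c: float) -> float:
    """Assumed classic LSH exponent 1 / c^2 for the Euclidean space."""
    return 1.0 / (c * c)

def rho_classic_lsh_hamming(c: float) -> float:
    """Assumed classic LSH exponent 1 / c for the Hamming space."""
    return 1.0 / c

if __name__ == "__main__":
    for c in (1.5, 2.0, 3.0):
        print(
            f"c={c}: Euclidean {rho_data_dependent_euclidean(c):.4f} "
            f"vs classic {rho_classic_lsh_euclidean(c):.4f}; "
            f"Hamming {rho_data_dependent_hamming(c):.4f} "
            f"vs classic {rho_classic_lsh_hamming(c):.4f}"
        )
```

Under these assumptions the data-dependent exponent is strictly smaller for every c > 1, since 2c^2 - 1 > c^2 and 2c - 1 > c whenever c > 1.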
We show tight upper and lower bounds for time-space trade-offs for the c-approximate Near Neighbor Search problem. For the d-dimensional Euclidean space and n-point datasets, we develop a data structure with space $n^{1+\rho_u+o(1)} + O(dn)$ and query time $n^{\rho_q+o(1)} + dn^{o(1)}$ for every $\rho_u, \rho_q \ge 0$ with:

$$c^2 \sqrt{\rho_q} + (c^2 - 1) \sqrt{\rho_u} = \sqrt{2c^2 - 1}. \tag{0.1}$$

In particular, for the approximation c = 2 we get:
• Space $n^{1.77\ldots}$ and query time $n^{o(1)}$, significantly improving upon known data structures that support very fast queries [IM98, KOR00];
• Space $n^{1.14\ldots}$ and query time $n^{0.14\ldots}$, matching the optimal data-dependent Locality-Sensitive Hashing (LSH) from [AR15];
• Space $n^{1+o(1)}$ and query time $n^{0.43\ldots}$, making significant progress in the regime of near-linear space, which is arguably of the most interest for practice.

This is the first data structure that achieves sublinear query time and near-linear space for every approximation factor c > 1, improving upon [Kap15]. The data structure is a culmination of a long line of work on the problem for all space regimes; it builds on Spherical Locality-Sensitive Filtering [BDGL16] and data-dependent hashing [AINR14, AR15]. Our matching lower bounds are of two types: conditional and unconditional. First, we prove tightness of the whole trade-off (0.1) in a restricted model of computation, which captures all known hashing-based approaches. We then show unconditional cell-probe lower bounds for one and two probes that match (0.1) for $\rho_q = 0$, improving upon the best known lower bounds from [PTW10]. In particular, this is the first space lower bound (for any static data structure) for two probes which is not polynomially smaller than the one-probe bound. To show the result for two probes, we establish and exploit a connection to locally-decodable codes.

* This paper merges two arXiv preprints: [Laa15c] (appeared online on November 24, 2015) and [ALRW16] (appeared online on May 9, 2016), and subsumes both of these articles. The full version containing all the proofs is available at https://arxiv.org/abs/1608.03580
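The three c = 2 operating points quoted above can be checked with a short calculation. The sketch below assumes the trade-off (0.1) has the form c^2 sqrt(rho_q) + (c^2 - 1) sqrt(rho_u) = sqrt(2c^2 - 1); this form is an assumption insofar as the equation did not survive in the scraped text, but it reproduces the exponents 1.77..., 1.14.../0.14... and 0.43... listed in the abstract. The function names are hypothetical, and o(1) terms are ignored.

```python
import math

# Sketch of the assumed trade-off (0.1):
#   c^2 * sqrt(rho_q) + (c^2 - 1) * sqrt(rho_u) = sqrt(2c^2 - 1)
# Names are illustrative; o(1) terms are ignored throughout.

def query_exponent(c: float, rho_u: float) -> float:
    """Solve the assumed trade-off for rho_q given the space exponent rho_u."""
    remainder = math.sqrt(2.0 * c * c - 1.0) - (c * c - 1.0) * math.sqrt(rho_u)
    if remainder <= 0.0:
        # More space than the trade-off needs: the query exponent bottoms out at 0.
        return 0.0
    return (remainder / (c * c)) ** 2

def space_exponent_for_fast_queries(c: float) -> float:
    """Value of rho_u at which rho_q reaches 0 under the assumed trade-off."""
    return (2.0 * c * c - 1.0) / (c * c - 1.0) ** 2

def balanced_exponent(c: float) -> float:
    """Value rho_u = rho_q = rho; simplifies to 1 / (2c^2 - 1)."""
    return 1.0 / (2.0 * c * c - 1.0)

if __name__ == "__main__":
    c = 2.0
    print("fast queries:  space exponent 1 +", space_exponent_for_fast_queries(c))
    print("balanced:      rho_u = rho_q =", balanced_exponent(c))
    print("linear space:  query exponent", query_exponent(c, 0.0))
```

For c = 2 the three values are 7/9, 1/7 and 7/16, which agree with the truncated decimals in the bullet list.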
We present a new data structure for the c-approximate near neighbor problem (ANN) in the Euclidean space. For n points in $\mathbb{R}^d$, our algorithm achieves $O(n^{\rho} + d \log n)$ query time and $O(n^{1+\cdots})$ …
Abstract. We present a novel approach to graph partitioning based on the notion of natural cuts. Our algorithm, called PUNCH, has two phases. The first phase performs a series of minimum-cut computations to identify and contract dense regions of the graph. This reduces the graph size, but preserves its general structure. The second phase uses a combination of greedy and local search heuristics to assemble the final partition. The algorithm performs especially well on road networks, which have an abundance of natural cuts (such as bridges, mountain passes, and ferries). In a few minutes, it obtains excellent partitions for continental-sized networks.