We consider range queries that search for low-frequency elements (least frequent elements and α-minorities) in arrays. An α-minority of a query range has multiplicity no greater than an α fraction of the elements in the range. Our data structure for the least frequent element range query problem requires O(n) space, O(n^{3/2}) preprocessing time, and O(√n) query time. A reduction from boolean matrix multiplication to this problem shows the hardness of simultaneous improvements in both preprocessing time and query time. Our data structure for the α-minority range query problem requires O(n) space, supports queries in O(1/α) time, and allows α to be specified at query time.
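To make the two query types concrete, here is a brute-force reference in Python. It is only a sketch of the query semantics, not the data structure from the abstract: it scans the whole range on every query, so it runs in time linear in the range length rather than O(√n) or O(1/α). The function names and the 1-based inclusive indexing are assumptions made for illustration.

```python
from collections import Counter


def least_frequent(a, i, j):
    """Return an element of a[i..j] (1-based, inclusive) with the smallest count.

    Only elements that actually occur in the range are considered.
    """
    counts = Counter(a[i - 1:j])
    return min(counts, key=counts.get)


def alpha_minority(a, i, j, alpha):
    """Return some element whose count in a[i..j] is at most alpha * (j - i + 1).

    Returns None when no element of the range qualifies.
    """
    counts = Counter(a[i - 1:j])
    limit = alpha * (j - i + 1)
    for element, count in counts.items():
        if count <= limit:
            return element
    return None
```

For example, in `[1, 1, 2, 1, 3, 3]` the element 2 occurs once in the full range, so it is both the least frequent element and a 0.2-minority (1 ≤ 0.2 · 6).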
Given an array A of size n, we consider the problem of answering range majority queries: given a query range [i..j] where 1 ≤ i ≤ j ≤ n, return the majority element of the subarray A[i..j] if it exists. We describe a linear space data structure that answers range majority queries in constant time. We further generalize this problem by defining range α-majority queries: given a query range [i..j], return all the elements in the subarray A[i..j] with frequency greater than α(j − i + 1). We prove an upper bound on the number of α-majorities that can exist in a subarray, assuming that query ranges are restricted to be larger than a given threshold. Using this upper bound, we generalize our range majority data structure to answer range α-majority queries in O(1/α) time using O(n lg(1/α + 1)) space, for any fixed α ∈ (0, 1). This result is interesting since other similar range query problems based on frequency have nearly logarithmic lower bounds on query time when restricted to linear space.
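The query definitions above can be pinned down with a brute-force reference in Python. This is a sketch of what the queries return, not the constant-time or O(1/α)-time structure described in the abstract: it recounts the range on every call. The function names are hypothetical; indices are 1-based and inclusive to match the text.

```python
from collections import Counter


def range_alpha_majorities(a, i, j, alpha):
    """Return all elements of a[i..j] with frequency greater than alpha * (j - i + 1).

    Indices are 1-based and inclusive. The result is sorted for determinism.
    """
    counts = Counter(a[i - 1:j])
    threshold = alpha * (j - i + 1)
    return sorted(x for x, c in counts.items() if c > threshold)


def range_majority(a, i, j):
    """Return the majority element of a[i..j], or None if there is none.

    A majority is the special case alpha = 1/2, so at most one element qualifies.
    """
    result = range_alpha_majorities(a, i, j, 0.5)
    return result[0] if result else None
```

Since each α-majority accounts for more than an α fraction of the range, fewer than 1/α of them can coexist in any range, which is why the output of `range_alpha_majorities` stays small.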
Data structures for similarity search are commonly evaluated on data in vector spaces, but distance-based data structures are also applicable to non-vector spaces with no natural concept of dimensionality. The intrinsic dimensionality statistic of Chávez and Navarro provides a way to compare the performance of similarity indexing and search algorithms across different spaces, and predict the performance of index data structures on non-vector spaces by relating them to equivalent vector spaces. We characterise its asymptotic behaviour, and give experimental results to calibrate these comparisons.
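The abstract does not restate the statistic. As it is usually given, the intrinsic dimensionality of Chávez and Navarro is ρ = μ² / (2σ²), where μ and σ² are the mean and variance of the distance between two points of the space. The Python sketch below estimates it from all pairwise distances in a finite sample. The function name, the exhaustive pairing, and the use of the population variance are assumptions made for illustration, not details taken from the paper.

```python
import itertools
import statistics


def intrinsic_dimensionality(points, distance):
    """Estimate rho = mu^2 / (2 * sigma^2) from all pairwise distances.

    `distance` may be any metric, so the points need not be vectors.
    Raises ZeroDivisionError if every pairwise distance is identical.
    """
    dists = [distance(p, q) for p, q in itertools.combinations(points, 2)]
    mu = statistics.fmean(dists)
    variance = statistics.pvariance(dists, mu)
    return mu * mu / (2 * variance)
```

Because only a distance function is required, the same call works for strings under edit distance or any other metric space; for instance, `intrinsic_dimensionality([0, 1, 2], lambda p, q: abs(p - q))` evaluates the statistic for three points on a line.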