We propose a version of cache oblivious search trees which is simpler than the previous proposal of Bender, Demaine and Farach-Colton and has the same complexity bounds. In particular, our data structure avoids the use of weight balanced B-trees, and can be implemented as just a single array of data elements, without the use of pointers. The structure also improves space utilization. For storing n elements, our proposal uses (1+epsilon)n times the element size of memory, and performs searches in worst case O(log_B n) memory transfers, updates in amortized O((log^2 n)/(epsilon B)) memory transfers, and range queries in worst case O(log_B n + k/B) memory transfers, where k is the size of the output. The basic idea of our data structure is to maintain a dynamic binary tree of height log n + O(1) using existing methods, embed this tree in a static binary tree, which in turn is embedded in an array in a cache oblivious fashion, using the van Emde Boas layout of Prokop. We also investigate the practicality of cache obliviousness in the area of search trees, by providing an empirical comparison of different methods for laying out a search tree in memory. The source code of the programs, our scripts and tools, and the data we present here are available online under ftp.brics.dk/RS/01/36/Experiments/.
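To illustrate the van Emde Boas layout mentioned above, here is a minimal sketch that lists the nodes of a complete binary tree in the order they would be stored in the array. It is not the authors' code: the function name veb_order, the use of breadth-first node numbers (root 1, children 2i and 2i+1), and the choice to give the top subtree the rounded-down half of the height are all assumptions made for the example.

```python
def veb_order(height, root=1):
    """List the nodes of a complete binary tree in van Emde Boas order.

    Nodes are named by breadth-first numbers (root = 1, children of i are
    2i and 2i + 1). `height` counts levels, so height 1 is a single node.
    Assumption: the top subtree gets floor(height / 2) levels.
    """
    if height == 1:
        return [root]
    top_h = height // 2        # levels in the top subtree
    bot_h = height - top_h     # levels in each bottom subtree
    # Lay out the top subtree first, recursively.
    order = veb_order(top_h, root)
    # The descendants of `root` that are top_h levels down are the roots
    # of the bottom subtrees; lay those out one after another.
    first = root << top_h
    for i in range(1 << top_h):
        order.extend(veb_order(bot_h, first + i))
    return order


def veb_positions(height):
    """Map each breadth-first node number to its index in the array."""
    return {node: pos for pos, node in enumerate(veb_order(height))}
```

For a tree with four levels, the root and its two children are stored first, followed by each three-node bottom subtree in turn, so any root-to-leaf path touches only a few contiguous regions of the array.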
The triplet and quartet distances are distance measures to compare two rooted and two unrooted trees, respectively. The leaves of the two trees should have the same set of n labels. The distances are defined by enumerating all subsets of three labels (triplets) and four labels (quartets), respectively, and counting how often the induced topologies in the two input trees are different. In this paper we present efficient algorithms for computing these distances. We show how to compute the triplet distance in time O(n log n) and the quartet distance in time O(dn log n), where d is the maximal degree of any node in the two trees. Within the same time bounds, our framework also allows us to compute the parameterized triplet and quartet distances, where a parameter is introduced to weight resolved (binary) topologies against unresolved (non-binary) topologies. The previous best algorithms for computing the triplet and parameterized triplet distances have O(n^2) running time, while the previous best algorithms for computing the quartet distance include an O(d^9 n log n) time algorithm and an O(n^2.688) time algorithm, where the latter can also compute the parameterized quartet distance. Since d ≤ n, our algorithms improve on all these algorithms.
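The definition of the triplet distance can be made concrete with a brute-force sketch that enumerates every set of three labels and compares the induced topologies. This is only the naive cubic-time definition, not the O(n log n) algorithm of the paper. The tree encoding (nested tuples whose leaves are non-tuple labels such as strings) and all function names are assumptions made for the example.

```python
from itertools import combinations


def leaf_paths(tree, path=(), out=None):
    """Map each leaf label to the tuple of child indices from the root."""
    if out is None:
        out = {}
    if isinstance(tree, tuple):          # internal node: recurse into children
        for i, child in enumerate(tree):
            leaf_paths(child, path + (i,), out)
    else:                                # leaf: record its path
        out[tree] = path
    return out


def lca_depth(p, q):
    """Depth of the lowest common ancestor = length of the common prefix."""
    depth = 0
    for a, b in zip(p, q):
        if a != b:
            break
        depth += 1
    return depth


def triplet_topology(paths, a, b, c):
    """Return the pair that is grouped together, or None if unresolved."""
    pairs = [(a, b), (a, c), (b, c)]
    depths = [lca_depth(paths[x], paths[y]) for x, y in pairs]
    deepest = max(depths)
    winners = [frozenset(p) for p, d in zip(pairs, depths) if d == deepest]
    # Exactly one deepest pair means a resolved triplet; otherwise all three
    # pairs meet at the same node and the triplet is unresolved.
    return winners[0] if len(winners) == 1 else None


def triplet_distance(t1, t2):
    """Count the triplets whose induced topologies differ in t1 and t2."""
    p1, p2 = leaf_paths(t1), leaf_paths(t2)
    assert set(p1) == set(p2), "both trees need the same leaf labels"
    return sum(
        triplet_topology(p1, a, b, c) != triplet_topology(p2, a, b, c)
        for a, b, c in combinations(sorted(p1), 3)
    )
```

For instance, the trees ((("a", "b"), "c"), "d") and ((("a", "c"), "b"), "d") disagree only on the triplet {a, b, c}, so their distance under this sketch is 1.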
Evolutionary trees describing the relationship for a set of species are central in evolutionary biology, and quantifying differences between evolutionary trees is therefore an important task. The quartet distance is a distance measure between trees previously proposed by Estabrook, McMorris, and Meacham. The quartet distance between two unrooted evolutionary trees is the number of quartet topology differences between the two trees, where a quartet topology is the topological subtree induced by four species. In this paper we present an algorithm for computing the quartet distance between two unrooted evolutionary trees of n species, where all internal nodes have degree three, in time O(n log n). The previous best algorithm for the problem uses time O(n^2). Introduction. The evolutionary relationship for a set of species can be described by an evolutionary tree, which is a rooted tree where the leaves correspond to the species, and the internal nodes correspond to speciation events, i.e. the points in time where the evolution has diverged in different directions. The direction of the evolution is described by the location of the root, which corresponds to the most recent common ancestor for all the species, and the duration of evolutionary periods is described by assigning lengths to the edges. The true evolutionary tree for a set of species is most often unknown; estimating it from obtainable information about the species, e.g. genomic data, is of great interest in evolutionary biology. The problem of computationally estimating aspects of the true evolutionary tree requires a model describing how to use the available information about the species in question. Given a model, the problem of estimating aspects of the true evolutionary tree is referred to as constructing the evolutionary tree in that model.
Many models and construction methods are available; see Chapter 17 of [11] for an overview. An important aspect of the true evolutionary tree for a set of species is its undirected tree topology induced by ignoring the location of the root and the length of the edges. Many models and methods are concerned with estimating just this tree topology, usually under the further assumption that all internal nodes have degree three. We say that such models and methods are concerned with constructing the unrooted evolutionary tree of
We analyze the problem of sparse-matrix dense-vector multiplication (SpMV) in the I/O model. In the SpMV, the objective is to compute y = Ax, where A is a sparse matrix and x and y are vectors. We give tight upper and lower bounds on the number of block transfers as a function of the sparsity k, the number of nonzeros in a column of A. Parameter k is a knob that bridges the problems of permuting (k = 1) and dense matrix multiplication (k = N). When the nonzero elements of A are stored in column-major order,
For a set of points in the plane and a fixed integer k > 0, the Yao graph Y_k partitions the space around each point into k equiangular cones of angle θ = 2π/k, and connects each point to a nearest neighbor in each cone. It is known for all Yao graphs, with the sole exception of Y5, whether or not they are geometric spanners. In this paper we close this gap by showing that for odd k ≥ 5, the spanning ratio of Y_k is at most 1/(1 − 2 sin(3θ/8)), which gives the first constant upper bound for Y5, and is an improvement over the previous bound of 1/(1 − 2 sin(θ/2)) for odd k ≥ 7. We further reduce the upper bound on the spanning ratio for Y5 from 10.9 to 2 + √3 ≈ 3.74, which falls slightly below the lower bound of 3.79 established for the spanning ratio of Θ5 (Θ-graphs differ from Yao graphs only in the way they select the closest neighbor in each cone). We also give a lower bound of 2.87 on the spanning ratio of Y5. Finally, we revisit the Y6 graph, which plays a particularly important role as the transition between the graphs (k > 6) for which simple inductive proofs are known, and the graphs (k ≤ 6) whose best spanning ratios have been established by complex arguments. Here we reduce the known spanning ratio of Y6 from 17.6 to 5.8, getting closer to the spanning ratio of 2 established for Θ6.
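The Yao graph construction described above can be sketched by brute force in quadratic time. The function names, the convention that cone 0 starts at the positive x-axis with half-open cones, and the tie-breaking by lowest index are assumptions for the example and are not taken from the paper. The second function simply evaluates the stated bound 1/(1 − 2 sin(3θ/8)) for odd k.

```python
import math


def yao_graph(points, k):
    """Directed Yao graph: each point picks a nearest point in each cone.

    Assumptions: cone c covers angles [c * theta, (c + 1) * theta) measured
    from the positive x-axis; ties go to the lower point index.
    """
    theta = 2 * math.pi / k
    edges = set()
    for i, (px, py) in enumerate(points):
        best = {}  # cone index -> (distance, neighbor index)
        for j, (qx, qy) in enumerate(points):
            if i == j:
                continue
            angle = math.atan2(qy - py, qx - px) % (2 * math.pi)
            # Clamp guards against rounding pushing the index up to k.
            cone = min(int(angle / theta), k - 1)
            dist = math.hypot(qx - px, qy - py)
            if cone not in best or dist < best[cone][0]:
                best[cone] = (dist, j)
        for _, j in best.values():
            edges.add((i, j))
    return edges


def odd_yao_upper_bound(k):
    """Evaluate 1 / (1 - 2 sin(3 * theta / 8)) with theta = 2 * pi / k."""
    theta = 2 * math.pi / k
    return 1.0 / (1.0 - 2.0 * math.sin(3.0 * theta / 8.0))
```

For k = 5 the bound evaluates to roughly 10.87, which agrees with the figure of 10.9 quoted above as the starting point for the improved bound of 2 + √3.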
We adapt the distribution sweeping method to the cache oblivious model. Distribution sweeping is the name used for a general approach for divide-and-conquer algorithms where the combination of solved subproblems can be viewed as a merging process of streams. We demonstrate by a series of algorithms for specific problems the feasibility of the method in a cache oblivious setting. The problems all come from computational geometry, and are: orthogonal line segment intersection reporting, the all nearest neighbors problem, the 3D maxima problem, computing the measure of a set of axis-parallel rectangles, computing the visibility of a set of line segments from a point, batched orthogonal range queries, and reporting pairwise intersections of axis-parallel rectangles. Our basic building block is a simplified version of the cache oblivious sorting algorithm Funnelsort of Frigo et al., which is of independent interest.