Range queries over multidimensional data are an important part of database workloads in many applications. Their execution may be accelerated by using multidimensional index structures (MDIS), such as kd-trees or R-trees. As for most index structures, the usefulness of this approach depends on the selectivity of the queries, and common wisdom holds that a simple scan beats MDIS for queries accessing more than 15%-20% of a dataset. However, this wisdom is largely based on evaluations that are almost two decades old, performed on data held on disks, applying IO-optimized data structures, and using single-core systems. The question is whether this rule of thumb still holds when multidimensional range queries (MDRQ) are performed on modern architectures with large main memories holding all data, multi-core CPUs, and data-parallel instruction sets.

In this paper, we study whether and how much modern hardware influences the performance ratio between index structures and scans for MDRQ. To this end, we conservatively adapted three popular MDIS, namely the R*-tree, the kd-tree, and the VA-file, to exploit features of modern servers and compared their performance to different flavors of parallel scans using multiple (synthetic and real-world) analytical workloads over multiple (synthetic and real-world) datasets of varying size, dimensionality, and skew. We find that all approaches benefit considerably from using main memory and parallelization, yet to varying degrees. Our evaluation indicates that, on current machines, scanning should be favored over parallel versions of classical MDIS even for very selective queries.
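To make the terms concrete, the following is a minimal sketch of a multidimensional range query answered by a plain sequential scan, together with the selectivity of that query. It is an illustration only and not the implementation evaluated in this paper: the names `scan_range_query` and `selectivity`, the row-wise list-of-tuples layout, and the inclusive bounds are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
# Hypothetical query representation: one inclusive (low, high) pair per dimension.
Box = Sequence[Tuple[float, float]]


def scan_range_query(points: List[Point], box: Box) -> List[int]:
    """Return the indices of all points inside the axis-aligned query box."""
    hits = []
    for i, p in enumerate(points):
        # A point matches only if every coordinate lies within its bounds.
        if all(lo <= x <= hi for x, (lo, hi) in zip(p, box)):
            hits.append(i)
    return hits


def selectivity(points: List[Point], box: Box) -> float:
    """Fraction of the dataset returned by the query (0.0 to 1.0)."""
    if not points:
        return 0.0
    return len(scan_range_query(points, box)) / len(points)


# Toy three-dimensional dataset and query box (made-up values).
points = [(1.0, 5.0, 2.0), (4.0, 4.0, 4.0), (9.0, 1.0, 3.0), (3.0, 6.0, 8.0)]
box = [(0.0, 5.0), (3.0, 7.0), (0.0, 5.0)]

result = scan_range_query(points, box)
fraction = selectivity(points, box)
```

The scan touches every point regardless of how many match, whereas an index tries to skip regions that cannot contain results; the rule of thumb above concerns the selectivity at which that skipping stops paying off.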
KEYWORDS
Multidimensional Index Structures, Modern Hardware

1 Queries over high-dimensional datasets or using similarity predicates are out of scope of this work; for supporting such use cases, we refer the reader to excellent surveys, like [6].

Figure 1: Classical disk-based set-up for MDIS (left) versus an adaptation to modern hardware (right). Left panel: hard disk drive, one thread, scalar instructions. Right panel: multi-core CPU and main memory, many threads, scalar/SIMD instructions.

[...] contrast, the classical MDIS were designed for row-wise data layouts. Thus, it is time to re-evaluate the performance of MDIS for MDRQ to see if the traditional rule of thumb still holds. Clearly, such a re-evaluation requires an adaptation of the original index structures to the features of modern hardware (see Figure 1) and should be carried out using analytical workloads.

In this experimental analysis, we study whether and how much the changes in hardware and workloads influence the performance of MDIS compared to sequential scans. To this end, we adapted three popular MDIS to be executed in a parallel and in-memory setting, namely (1) the R*-tree [2], an optimized variant of the R-tree [15], (2) the kd-tree [3], an index structure originally designed for in-memory computations, and (3) the VA-file [41], which can be considered a mixture of an MDIS and a sequential scan. Our adaptation is conser...