Abstract. This note presents a simplification and generalization of a previously reported algorithm for searching k-dimensional trees for nearest neighbors. If the distance between records is measured using L2, the Euclidean norm, the data structure used by the algorithm to determine the bounds of the search space can be simplified to a single number. Moreover, because distance measurements in L2 are rotationally invariant, the algorithm can be generalized to allow a partition plane to have an arbitrary orientation, rather than insisting that it be perpendicular to a coordinate axis, as in the original algorithm. When a k-dimensional tree is built, this plane can be found from the principal eigenvector of the covariance matrix of the records to be partitioned. These techniques and others yield variants of k-dimensional trees customized for specific applications.

It is wrong to assume that k-dimensional trees guarantee that a nearest-neighbor query completes in logarithmic expected time. For small k, logarithmic behavior is observed on all but tiny trees. However, for larger k, logarithmic behavior is achievable only with extremely large numbers of records. For k = 16, a search of a k-dimensional tree of 76,000 records examines almost every record.
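The two ideas in the abstract can be illustrated with a minimal sketch; it is not the note's own implementation. It assumes records are tuples of floats, splits each node at the median projection, and approximates the principal eigenvector by power iteration (a substitute chosen here to avoid external libraries). The names `principal_direction`, `build`, `nearest`, and `leaf_size` are hypothetical. The only search bound carried through the recursion is a single number, the squared L2 distance to the best record found so far.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def principal_direction(points, iters=100):
    """Approximate the principal eigenvector of the covariance matrix
    of `points` by power iteration (assumption: a simple stand-in for
    a full eigen-decomposition). Returns a unit-length vector."""
    n, k = len(points), len(points[0])
    mean = [sum(p[d] for p in points) / n for d in range(k)]
    centered = [[p[d] - mean[d] for d in range(k)] for p in points]
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    v = [rng.gauss(0.0, 1.0) for _ in range(k)]
    norm = math.sqrt(dot(v, v))
    v = [c / norm for c in v]
    for _ in range(iters):
        # Apply the covariance matrix implicitly: C v = sum_i x_i (x_i . v) / n
        w = [0.0] * k
        for x in centered:
            s = dot(x, v)
            for d in range(k):
                w[d] += s * x[d]
        norm = math.sqrt(dot(w, w))
        if norm == 0.0:
            break  # degenerate data; keep the current unit vector
        v = [c / norm for c in w]
    return v


def build(points, leaf_size=4):
    """Build a tree whose partition planes have arbitrary orientation.
    Assumes leaf_size >= 1."""
    if len(points) <= leaf_size:
        return {"leaf": list(points)}
    normal = principal_direction(points)
    ordered = sorted(points, key=lambda p: dot(p, normal))
    mid = len(ordered) // 2
    return {
        "normal": normal,                      # unit normal of the plane
        "offset": dot(ordered[mid], normal),   # plane: normal . x = offset
        "lo": build(ordered[:mid], leaf_size),
        "hi": build(ordered[mid:], leaf_size),
    }


def nearest(node, q, best=(math.inf, None)):
    """Return (squared distance, record) of the nearest neighbor of q.
    best[0] is the single number that bounds the search."""
    if "leaf" in node:
        for p in node["leaf"]:
            d2 = sum((a - b) ** 2 for a, b in zip(p, q))
            if d2 < best[0]:
                best = (d2, p)
        return best
    # Signed perpendicular distance from q to the partition plane.
    signed = dot(q, node["normal"]) - node["offset"]
    near, far = (node["lo"], node["hi"]) if signed < 0 else (node["hi"], node["lo"])
    best = nearest(near, q, best)
    # Visit the far side only if the plane is closer than the current bound.
    if signed * signed < best[0]:
        best = nearest(far, q, best)
    return best
```

A tree would be built with `tree = build(records)` and queried with `nearest(tree, query)`. Because the plane normal has unit length, the pruning test compares one squared perpendicular distance against one squared radius, which is valid in L2 regardless of the plane's orientation.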