Distance-based indexing is a widely used technique for general-purpose search. Pivot selection is the most crucial step in bulkloading a metric-space indexing tree. Current pivot selection methods are mainly linear. A non-linear method based on Locally Linear Embedding is proposed. Empirical results demonstrate that the new method outperforms existing methods.
Metric-space indexing is a general method for similarity queries over complex data. The quality of the index tree is a critical factor in query performance. Bulkloading a metric-space indexing tree can be expressed as two recursive steps, pivot selection and data partition, of which pivot selection dominates the quality of the index tree. Two heuristics for pivot selection, based on covariance and correlation, are proposed. Empirical results show that their performance is superior or comparable to that of existing methods.
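The two recursive steps named in the abstract, pivot selection and data partition, can be illustrated with a minimal Python sketch. It is not the method of either paper: the pivot selector is a generic farthest-first traversal used as a placeholder where an LLE-based, covariance-based, or correlation-based heuristic would go, and the partition step simply assigns each point to its nearest pivot. All names (`Node`, `select_pivots`, `partition`, `bulkload`, `leaf_points`) and the parameters `k` and `leaf_size` are hypothetical.

```python
import math
from dataclasses import dataclass, field


def euclidean(a, b):
    # Any metric works here; Euclidean distance is only an example.
    return math.dist(a, b)


@dataclass
class Node:
    pivots: list = field(default_factory=list)    # set on internal nodes
    children: list = field(default_factory=list)  # one child per pivot
    points: list = field(default_factory=list)    # set on leaves only


def select_pivots(points, k, dist):
    # Placeholder heuristic (assumption): farthest-first traversal.
    # A covariance-, correlation-, or LLE-based selector would replace this.
    pivots = [points[0]]
    while len(pivots) < min(k, len(points)):
        nxt = max(points, key=lambda p: min(dist(p, q) for q in pivots))
        if min(dist(nxt, q) for q in pivots) == 0:
            break  # every remaining point coincides with a pivot
        pivots.append(nxt)
    return pivots


def partition(points, pivots, dist):
    # Assign each point to its nearest pivot (ties go to the lower index).
    groups = [[] for _ in pivots]
    for p in points:
        i = min(range(len(pivots)), key=lambda j: dist(p, pivots[j]))
        groups[i].append(p)
    return groups


def bulkload(points, dist=euclidean, k=2, leaf_size=4):
    # Recursive bulkload: select pivots, partition the data, recurse.
    if len(points) <= leaf_size:
        return Node(points=list(points))
    pivots = select_pivots(points, k, dist)
    if len(pivots) < 2:
        # All points are identical, so no split is possible.
        return Node(points=list(points))
    groups = partition(points, pivots, dist)
    children = [bulkload(g, dist, k, leaf_size) for g in groups]
    return Node(pivots=pivots, children=children)


def leaf_points(node):
    # Collect every point stored in the leaves below this node.
    if not node.children:
        return list(node.points)
    return [p for child in node.children for p in leaf_points(child)]
```

For example, `bulkload([(float(i % 5), float(i // 5)) for i in range(25)])` builds a tree over a 5 by 5 grid. Because the pivots are pairwise distinct, each pivot falls into its own group, so every recursive call receives strictly fewer points and the recursion terminates.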