Abstract. Spectral clustering is based on spectral graph theory: it recasts the optimal clustering problem as a graph partitioning problem. It is a pairwise clustering algorithm that can cluster high-dimensional data sets after dimensionality reduction, which greatly reduces clustering time. Compared with traditional clustering algorithms, spectral clustering has the advantage that it can cluster sample spaces of arbitrary shape and converge to the globally optimal solution. However, large data sets are prevalent in the real world. When spectral clustering is applied to them, the volume of data slows convergence, and it may even be impossible to obtain results within the required time, which makes clustering impractical. This paper therefore implements spectral clustering of large-scale, high-dimensional data sets on the Hadoop cloud platform. Experiments show that the spectral clustering algorithm, once parallelized and deployed on Hadoop clusters, achieves good speedup and good scalability.
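To make the pipeline described in the abstract concrete, the following is a minimal single-machine sketch of spectral clustering in Python with NumPy. It is not the paper's Hadoop implementation. The Gaussian affinity, the symmetric normalized Laplacian, the farthest-point k-means initialization, and the names `spectral_clustering`, `sigma`, and `n_iter` are all assumptions chosen for illustration.

```python
import numpy as np

def spectral_clustering(X, k, sigma=1.0, n_iter=50):
    """Illustrative sketch: cluster the rows of X into k groups."""
    # Graph construction: Gaussian (RBF) affinity between every pair of points.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} W D^{-1/2}.
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]

    # Dimensionality reduction: keep the eigenvectors of the k smallest
    # eigenvalues (np.linalg.eigh returns eigenvalues in ascending order).
    _, vecs = np.linalg.eigh(L)
    U = vecs[:, :k]
    U = U / np.linalg.norm(U, axis=1, keepdims=True)

    # Plain k-means on the embedded rows, with a deterministic
    # farthest-point initialization (an assumption made for reproducibility).
    centers = [U[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(dist)])
    centers = np.array(centers)

    for _ in range(n_iter):
        dists = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            U[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels
```

The pairwise affinity matrix needs memory quadratic in the number of points, and the eigendecomposition is typically cubic in time, which is consistent with the abstract's observation that a single machine struggles on large data sets.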