The Kmeans algorithm is commonly used for user segmentation in operator data, but its k value is difficult to determine. The Canopy algorithm can help Kmeans determine k, but its result is strongly affected by the choice of radius. To address both problems, an improved Canopy-Kmeans algorithm is proposed. First, the initial data are divided into K1 coarse clusters using the Canopy algorithm with a small radius. Then, splitting or merging is used to reconstruct the K1 coarse clusters into K2 convergent clusters (K1≫K2). Finally, the K2 cluster centers are used as the initial centers for the Kmeans algorithm. In simulation experiments, the improved Canopy-Kmeans algorithm performed well in running time, clustering results, and squared error.
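The basic pipeline that the proposed method builds on (a Canopy pass that fixes the number of clusters and supplies initial centers for Kmeans) can be sketched as follows. This is a minimal illustration, not the algorithm proposed here: it assumes a single distance threshold `radius` instead of the usual two Canopy thresholds, it omits the split/merge reconstruction from K1 to K2 clusters because that step's details are not given above, and the function names `canopy_centers`, `kmeans`, and `sse` are hypothetical.

```python
import math


def canopy_centers(points, radius):
    """Greedy single-threshold Canopy pass (a simplifying assumption).

    Returns one center per canopy; the number of centers is the k value.
    """
    remaining = list(points)
    centers = []
    while remaining:
        center = remaining.pop(0)  # first unassigned point starts a canopy
        centers.append(center)
        # Remove every point that lies within `radius` of the new center.
        remaining = [p for p in remaining if math.dist(p, center) > radius]
    return centers


def kmeans(points, centers, max_iter=100):
    """Lloyd's iterations started from the given initial centers."""
    centers = [tuple(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster;
        # an empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers, clusters


def sse(centers, clusters):
    """Sum of squared distances from each point to its cluster center."""
    return sum(math.dist(p, c) ** 2
               for c, cluster in zip(centers, clusters)
               for p in cluster)


# Toy data (made up for illustration): three well-separated groups.
points = [
    (0.0, 0.0), (0.5, 0.2), (0.1, 0.6),
    (10.0, 10.0), (10.4, 9.7), (9.8, 10.3),
    (20.0, 0.0), (20.3, 0.4), (19.6, 0.1),
]
initial = canopy_centers(points, radius=3.0)
final_centers, clusters = kmeans(points, initial)
error = sse(final_centers, clusters)
```

In this sketch the value of k is simply `len(initial)`, which makes the dependence on `radius` visible: a smaller radius yields more canopies and therefore more initial centers, which is the sensitivity the proposed split/merge step is meant to reduce.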