Many clustering algorithms are available in data mining. Real-world data drawn from different sources may be large in both the number of instances and the number of attributes, and may arrive in different formats. Clustering algorithms are assessed on how well they cluster the given data and on the parameter values they find. Clustering may produce inappropriate results if the algorithm is not chosen wisely. This paper compares several clustering algorithms, namely K-Means clustering, Mini-Batch K-Means clustering, hierarchical clustering, bagging, and boosting, by applying each to high-dimensional datasets. After data cleaning, we clustered the datasets and compared the summaries for each algorithm on clustering tendency, clustering quality, a data-driven approach for evaluating the number of clusters, and the Normalized Mutual Information (NMI) metric, in order to provide guidance on choosing an algorithm that clusters the data effectively. In our results, K-Means clustering with the Local Clustering Coefficient (LCC) performs better than the other clustering algorithms, and the results are reported.
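As an illustration of the NMI metric named in the abstract, the following is a minimal sketch of how two cluster labelings could be compared. It is not the authors' implementation: the function names (`entropy`, `mutual_information`, `nmi`) are hypothetical, and normalizing by the arithmetic mean of the two entropies is an assumption, since the abstract does not say which NMI normalization was used.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (natural log) of a cluster labeling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information between two labelings of the same items."""
    n = len(a)
    count_a = Counter(a)
    count_b = Counter(b)
    joint = Counter(zip(a, b))
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log(c * n / (count_a[x] * count_b[y]))
    return mi


def nmi(a, b):
    """NMI normalized by the arithmetic mean of the entropies (an assumption)."""
    if len(a) != len(b):
        raise ValueError("labelings must have the same length")
    h_a, h_b = entropy(a), entropy(b)
    denom = (h_a + h_b) / 2
    if denom == 0:
        # Both labelings put every item in one cluster, so they agree trivially.
        return 1.0
    return mutual_information(a, b) / denom


if __name__ == "__main__":
    reference = [0, 0, 1, 1]
    print(nmi(reference, [1, 1, 0, 0]))  # same partition, labels swapped
    print(nmi(reference, [0, 1, 0, 1]))  # partition independent of the reference
```

Under this definition, two labelings that describe the same partition score 1 regardless of how the cluster ids are named, and statistically independent labelings score 0, which is why NMI is convenient for comparing the output of different clustering algorithms against a reference labeling.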