<span lang="EN-US">Large datasets have become increasingly important in data mining, which involves processing, storing, and handling vast amounts of data. However, handling and processing large datasets is time-consuming and memory-intensive. Researchers have therefore adopted partitioning strategies to improve controllability and performance and to reduce the time and memory required. Unfortunately, the many clustering techniques available in the literature can make it difficult for experts to choose the best technique for a given dataset. Furthermore, no single clustering technique handles all problems, such as cluster structure, noise, or density, and existing techniques need scalable solutions to manage large datasets. This paper therefore proposes an ensemble partition-based clustering method with majority voting for large dataset partitioning, aggregating k-means, k-medoids, fuzzy c-means, expectation-maximization (EM), and density-based spatial clustering of applications with noise (DBSCAN). In the first stage, each technique clusters the large dataset individually. In the second stage, the final clusters are determined by majority voting among the five clustering algorithms: each data instance is assigned to the cluster that receives the most votes. The experimental findings show that the ensemble partition-based clustering method outperforms each of the five individual clustering algorithms in terms of execution time and accuracy.</span>
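The second-stage vote can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how cluster labels from different algorithms are matched, so the sketch assumes each labeling is first aligned to a reference labeling by the label permutation with the highest agreement (brute force, suitable only for a small number of clusters), and that DBSCAN noise points, marked `-1`, abstain from the vote. The function names `align_labels` and `majority_vote` and the toy labelings are hypothetical.

```python
from collections import Counter
from itertools import permutations


def align_labels(reference, labels, k):
    """Relabel `labels` to agree with `reference` as much as possible.

    Assumption: labels are integers in 0..k-1, with -1 meaning noise.
    Brute force over all k! permutations, so only practical for small k.
    """
    best_perm, best_agree = None, -1
    for perm in permutations(range(k)):
        agree = sum(
            1 for r, l in zip(reference, labels) if l >= 0 and perm[l] == r
        )
        if agree > best_agree:
            best_perm, best_agree = perm, agree
    return [best_perm[l] if l >= 0 else -1 for l in labels]


def majority_vote(labelings, k):
    """Combine several labelings of the same instances by majority vote.

    The first labeling serves as the reference for label alignment.
    Noise votes (-1) are ignored; an instance with no valid vote stays -1.
    """
    reference = labelings[0]
    aligned = [reference] + [
        align_labels(reference, lab, k) for lab in labelings[1:]
    ]
    final = []
    for votes in zip(*aligned):
        valid = [v for v in votes if v >= 0]
        final.append(Counter(valid).most_common(1)[0][0] if valid else -1)
    return final


# Toy first-stage outputs for six instances and two clusters (made-up data).
kmeans_labels = [0, 0, 0, 1, 1, 1]
kmedoids_labels = [1, 1, 1, 0, 0, 0]   # same partition, labels swapped
fcm_labels = [0, 0, 1, 1, 1, 1]        # hardened fuzzy memberships
em_labels = [1, 1, 1, 1, 0, 0]
dbscan_labels = [0, 0, -1, 1, 1, -1]   # -1 marks noise

final = majority_vote(
    [kmeans_labels, kmedoids_labels, fcm_labels, em_labels, dbscan_labels], k=2
)
print(final)
```

For larger numbers of clusters, the brute-force alignment could be replaced by an assignment solver such as the Hungarian algorithm; that substitution is likewise an assumption and is not described in the abstract.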