Abstract: The application of several swarm intelligence and evolutionary metaheuristic algorithms in data clustering problems has in the past few decades gained wide popularity and acceptance due to their success in solving and finding good quality solutions to a variety of complex real-world optimization problems. Clustering is considered one of the most important data analysis techniques in the domain of data mining. A clustering problem refers to the partitioning of unlabeled data objects into a certain number of clu…
“…All the different literature and comparative analyses results do point to the fact that the FA is a very efficient and robust metaheuristic algorithm for solving real-world problems. More so, the findings from Ezugwu [41] and Agbaje et al [51] on the promising performance of the FA for automatic clustering compelled us to go into this research to investigate further the superior performances of both the improved nutation based firefly algorithm and its hybrid variants for automatic data clustering.…”
Section: Related Work (mentioning)
confidence: 99%
“…Ezugwu [41] presented an extensive survey study of major nature-inspired metaheuristic algorithms that have been applied to solve automatic data clustering problems. Furthermore, the author carried out a comparative study of several modified well-known global metaheuristic algorithms to solve automatic clustering problems, of which three hybrid swarm intelligence and evolutionary algorithms, namely, particle swarm differential evolution algorithm, firefly differential evolution algorithm and invasive weed optimization differential evolution algorithm, were employed to deal with the task of automatic clustering.…”
Section: Related Work (mentioning)
confidence: 99%
“…No record of a similar research focus in the literature exist as of the time of writing this paper. [41]) on the current ( ) Apply ABC updating formula (see [52]) on the current n ( ) Apply IWO updating formula (see [41]) on the current ( ) Apply TLBO updating formula (see [53]) on the current ( ) Update the global best solution in the whole population Evaluate the fitness value of each individual candidate solution Update the new value as the global best…”
Section: A Firefly-based Hybrids and Clustering Problem Description (mentioning)
confidence: 99%
“…The first stage engages the modified FA algorithm by randomly generating initial swarm, where the number of fireflies equal to the number of clusters and the swarm population is uniformly distributed across the dimension of the dataset, which in this case is the clustering problem search space. After the swarm initialization, the next task is the evaluation of the best swarm according to the fitness function determined by the DB and CS validity indices [41]. Note that the best swarm position, for example, represents the data point that achieves the minimum distance to the swarm from its previous searches.…”
Section: End For End For End While End (mentioning)
confidence: 99%
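The initialization step quoted above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each firefly encodes `k_max` candidate centroids drawn uniformly from the bounding box of the dataset; the function name `init_swarm` and its parameters are hypothetical and do not come from the cited paper.

```python
import random


def init_swarm(data, n_fireflies, k_max, seed=0):
    """Return n_fireflies candidate solutions, each a list of k_max centroids.

    Assumption: every centroid coordinate is sampled uniformly between the
    minimum and maximum value of that dimension in the dataset.
    """
    rng = random.Random(seed)
    dims = len(data[0])
    # Per-dimension bounds of the search space, taken from the data itself.
    lo = [min(p[d] for p in data) for d in range(dims)]
    hi = [max(p[d] for p in data) for d in range(dims)]
    return [
        [[rng.uniform(lo[d], hi[d]) for d in range(dims)] for _ in range(k_max)]
        for _ in range(n_fireflies)
    ]


data = [(0.0, 1.0), (2.0, 5.0), (4.0, 3.0)]
swarm = init_swarm(data, n_fireflies=5, k_max=3)
```

Each candidate in `swarm` would then be scored with a validity index such as DB or CS before the firefly update rules are applied.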
“…The parameter configurations of ABC, IWO, PSO, and TLBO are further detailed in Table 2a. The details of the remaining datasets namely, Jain dataset, Pathbased dataset, Spiral dataset, and Thyroid can be obtained in [39] for Jain, [40] for both Pathbased and Spiral, and [41] for the Thyroid dataset. The twelve datasets configurations are summarized in Table 2b.…”
In cluster analysis, the goal has always been to devise the best possible means of automatically determining the number of clusters. However, because of a lack of prior domain knowledge and the uncertainty associated with data object characteristics, it is challenging to choose an appropriate number of clusters, especially when dealing with data objects of high dimensions, varying data sizes, and density. In the last few decades, different researchers have proposed and developed several nature-inspired metaheuristic algorithms to solve data clustering problems. Many studies have shown that the firefly algorithm is a very robust, efficient and effective nature-inspired swarm intelligence global search technique, which has been successfully applied to solve diverse NP-hard optimization problems. However, the diversification search process employed by the firefly algorithm can lead to reduced speed and convergence rate for large-scale optimization problems. Thus, this study investigates the application of four hybrid firefly algorithms to the task of automatic clustering of high-density and large-scale unlabelled datasets. In contrast to most of the existing classical heuristic-based data clustering analysis techniques, the proposed hybrid algorithms do not require any prior knowledge of the data objects to be classified. Instead, the hybrid methods automatically determine the optimal number of clusters empirically during program execution. Two well-known clustering validity indices, namely the Compact-Separated and Davies-Bouldin indices, are employed to evaluate the superiority of the implemented firefly hybrid algorithms. Furthermore, twelve standard ground truth clustering datasets from the UCI Machine Learning Repository are used to evaluate the robustness and effectiveness of the algorithms against those of the classical swarm optimization algorithms and other related clustering results from the literature. The experimental results show that the new clustering methods outperform existing standalone and other hybrid metaheuristic techniques in terms of clustering validity measures.
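As an illustration of how a validity index can serve as a fitness function, the following is a minimal sketch of the standard Davies-Bouldin (DB) index, where lower values indicate more compact and better separated clusters. It uses plain Euclidean distance and the mean point-to-centroid distance as the scatter term; the function name `davies_bouldin` is hypothetical, and the paper's exact formulation may differ.

```python
import math


def davies_bouldin(points, labels):
    """Davies-Bouldin index: mean over clusters of the worst-case
    (scatter_i + scatter_j) / distance(centroid_i, centroid_j)."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    if len(clusters) < 2:
        raise ValueError("the DB index needs at least two clusters")

    # Centroid of each cluster (coordinate-wise mean).
    centroids = {
        label: [sum(coord) / len(members) for coord in zip(*members)]
        for label, members in clusters.items()
    }
    # Scatter: average distance of a cluster's points to its centroid.
    scatter = {
        label: sum(math.dist(p, centroids[label]) for p in members) / len(members)
        for label, members in clusters.items()
    }
    total = 0.0
    for i in clusters:
        # Note: coincident centroids would cause a division by zero here.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = davies_bouldin(points, [0, 0, 1, 1])  # two tight, distant groups
bad = davies_bouldin(points, [0, 1, 0, 1])   # groups that straddle the gap
```

In an automatic clustering setting, a metaheuristic would minimize this value over candidate partitions with different numbers of clusters.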
In this article, quantum inspired incarnations of two swarm based meta‐heuristic algorithms, namely, Crow Search Optimization Algorithm and Intelligent Crow Search Optimization Algorithm have been proposed for automatic clustering of colour images. The performance and effectiveness of the proposed algorithms have been judged by experimenting on 15 Berkeley images and five publicly available real life images of different sizes. The validity of the proposed algorithms has been justified with the help of four different cluster validity indices, namely, Pakhira Bandyopadhyay Maulik, I‐index, Silhouette and CS‐measure. Moreover, Sobol's sensitivity analysis has been performed to tune the parameters of the proposed algorithms. The experimental results prove the superiority of proposed algorithms with respect to optimal fitness, computational time, convergence rate, accuracy, robustness, t‐test and Friedman test. Finally, the efficacy of the proposed algorithms has been proved with the help of quantitative evaluation of segmentation evaluation metrics.
This paper proposes assigning points to clusters, represented by their centers, based on weighted expected distances in a cluster analysis context. The proposed clustering algorithm has mechanisms to create new clusters, to merge two nearby clusters, to remove very small clusters, and to identify points as 'noise' when they are beyond a reasonable neighborhood of a center or belong to a cluster with very few points. The presented clustering algorithm is evaluated using four randomly generated and two well-known data sets. The obtained clustering is compared to other clustering algorithms through the visualization of the clustering, the value of the DB validity measure and the value of the sum of within-cluster distances. The preliminary comparison of results shows that the proposed clustering algorithm is very efficient and effective.
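The noise rule described in this abstract can be sketched roughly as follows. This is only an illustration: it substitutes plain Euclidean distance for the paper's weighted expected distance, and the function name `flag_noise` and the parameters `radius` and `min_size` are hypothetical choices, not taken from the paper.

```python
import math


def flag_noise(points, centers, radius, min_size):
    """Assign each point to its nearest center, or to -1 (noise) when the
    point lies farther than `radius` from that center or the center's
    cluster holds fewer than `min_size` points."""
    nearest = [
        min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
        for p in points
    ]
    # Cluster sizes are counted before any point is marked as noise.
    sizes = [nearest.count(c) for c in range(len(centers))]
    labels = []
    for p, c in zip(points, nearest):
        too_far = math.dist(p, centers[c]) > radius
        too_small = sizes[c] < min_size
        labels.append(-1 if too_far or too_small else c)
    return labels


points = [(0, 0), (0.5, 0), (0, 0.5), (10, 10), (3, 0)]
centers = [(0, 0), (10, 10)]
labels = flag_noise(points, centers, radius=1.0, min_size=2)
```

Here the point at `(10, 10)` is flagged because its cluster has a single member, and `(3, 0)` is flagged because it lies outside the chosen radius of its nearest center.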