“…To model the behavior of a swarm, the techniques draw on animals and insects such as bees, ants, birds, and fish [74,77]. Most recent studies used swarm intelligence to solve difficult real-world problems in areas such as networking, traffic routing, robotics, economics, industry, and games.…”
Section: Optimization For Objective Function Of Partitioning Clusteri… (mentioning; confidence: 99%)
“…However, the optimization still has difficulty avoiding the problems of local minima and early convergence [11,33,101]. Several examples of population-based optimization are reviewed: ant colony optimization (ACO), ant lion optimization (ALO), firefly algorithm (FA), and particle swarm optimization (PSO) [11,33,70,71,77,81,83,86,89].…”
Clustering techniques can group genes based on similarity in biological function. However, a drawback of clustering techniques is that the optimal number of clusters cannot be identified beforehand. Several existing optimization techniques can address this issue. In addition, clustering validation can estimate the likely number of clusters and hence increase the chances of identifying biologically informative genes. This paper reviews and provides examples of existing methods for clustering genes, optimization of the objective function, and clustering validation. Clustering techniques can be categorized into partitioning, hierarchical, grid-based, and density-based techniques. We also highlight the advantages and disadvantages of each category. To optimize the objective function, we introduce the swarm intelligence technique and compare its performance with that of other methods. Moreover, we discuss how internal and external criteria differ in measuring cluster quality. We also investigate the performance of several clustering techniques by applying them to a leukemia dataset. The results show that grid-based clustering techniques provide better classification accuracy; however, partitioning clustering techniques are superior in identifying prognostic markers of leukemia. Therefore, this review suggests combining clustering techniques such as CLIQUE and k-means to yield high-quality gene clusters.
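As an illustration of the swarm-based optimization of a partitioning objective mentioned above, the following is a minimal sketch of particle swarm optimization (PSO) applied to the k-means objective (within-cluster sum of squared distances). It is not the method of the cited review. The function names `sse` and `pso_cluster`, the parameter values (`w`, `c1`, `c2`, swarm size, iteration count), and the choice to encode each particle as a full set of k centroids are all assumptions made for the example.

```python
import random


def sse(centroids, points):
    """Within-cluster sum of squared distances: each point is charged
    the squared distance to its nearest centroid (the k-means objective)."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
    return total


def pso_cluster(points, k, n_particles=20, iters=100,
                w=0.7, c1=1.5, c2=1.5, seed=0):
    """Hypothetical helper: search for k centroids with a basic global-best PSO.

    w is the inertia weight; c1 and c2 weight the pull toward each
    particle's personal best and the swarm's global best (assumed values).
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # Each particle encodes k centroids, initialised at randomly chosen data points.
    pos = [[list(rng.choice(points)) for _ in range(k)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in p] for p in pos]
    pbest_cost = [sse(p, points) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = [c[:] for c in pbest[g]], pbest_cost[g]

    for _ in range(iters):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (gbest[j][d] - pos[i][j][d]))
                    pos[i][j][d] += vel[i][j][d]
            cost = sse(pos[i], points)
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = [c[:] for c in pos[i]], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = [c[:] for c in pos[i]], cost
    return gbest, gbest_cost
```

Calling `pso_cluster(points, k=2)` on a list of coordinate tuples returns the best centroid set found and its objective value. The global best never gets worse from one iteration to the next, but like any population-based search this sketch can still stall in a local minimum, which is the limitation the quoted passage points out.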
“…In 2018, Pacheco et al [189] introduced an automatic clustering algorithm called Anthill, which was motivated by the collaborative intelligent behaviour of ants. The proposed algorithm addressed the problem of automatic grouping, which is widely considered an NP-hard problem.…”
In real-world scenarios, identifying the optimal number of clusters in a dataset is a difficult task due to insufficient knowledge. Therefore, the indispensability of sophisticated automatic clustering algorithms for this purpose has been contemplated by some researchers. Several automatic clustering algorithms assisted by quantum-inspired metaheuristics have been developed in recent years. However, the literature lacks definitive documentation of the state-of-the-art quantum-inspired metaheuristic algorithms for automatically clustering datasets. This article presents a brief overview of the automatic clustering process to establish the importance of making the clustering process automatic. The fundamental concepts of the quantum computing paradigm are also presented to highlight the utility of quantum-inspired algorithms. This article thoroughly analyses some algorithms employed to address the automatic clustering of various datasets. The reviewed algorithms were classified according to their main sources of inspiration. In addition, some representative works of each classification were chosen from the existing works. Thirty-six such prominent algorithms were further critically analysed based on their aims, used mechanisms, data specifications, merits and demerits. Comparative results based on the performance and optimal computational time are also presented to critically analyse the reviewed algorithms. As such, this article promises to provide a detailed analysis of the state-of-the-art quantum-inspired metaheuristic algorithms, while highlighting their merits and demerits.
“…
Zhou et al [17] | Cluster analysis | Real life and artificial datasets | Fitness function evaluation
Pacheco et al [104] | Cluster analysis | Real life datasets | SI
Elaziz et al [105] | Cluster analysis | Real life and artificial datasets | Dunn index, SI, DB index and Calinski-Harabasz (CH) index
Chowdhury and Das [37] | Pattern recognition | Real life and artificial datasets | Huang's accuracy measure
Sheng et al [106] | Miscellaneous | Real life and artificial datasets | DB, CH, I-index
Zhou et al [107] | GPS data based trajectory | Real life: Taxi GPS Datasets | DB index
Agbaje et al [108] | Cluster analysis | Real life datasets | DB and CS indices
…problem at hand. From this analysis, GA has 887, PSO has 524, DE has 180, FA has 49, and DE has 9 published documents.…”
The application of several swarm intelligence and evolutionary metaheuristic algorithms to data clustering problems has gained wide popularity and acceptance over the past few decades, owing to their success in finding good-quality solutions to a variety of complex real-world optimization problems. Clustering is considered one of the most important data analysis techniques in the domain of data mining. A clustering problem refers to the partitioning of unlabeled data objects into a certain number of clusters based on their attribute values or features, with the objective of maximizing intra-cluster homogeneity and inter-cluster heterogeneity. This paper presents an up-to-date survey of major nature-inspired metaheuristic algorithms that have been employed to solve automatic clustering problems. Further, a comparative study of several modified well-known global metaheuristic algorithms is carried out to solve automatic clustering problems. Also, three hybrid swarm intelligence and evolutionary algorithms, namely, particle swarm differential evolution algorithm, firefly differential evolution algorithm and invasive weed optimization differential evolution algorithm, are proposed to deal with the task of automatic data clustering. In contrast to many of the existing traditional and evolutionary computational clustering techniques, the clustering algorithms presented in this paper do not require any predetermined information or prior knowledge of the dataset that is to be classified; rather, they are capable of spontaneously identifying the optimal number of partitions of the data points during the course of program execution. Forty-one benchmark datasets, comprising eleven artificial and thirty real-world datasets, are collated and utilized to evaluate the performance of the representative nature-inspired clustering algorithms.
According to the extensive experimental results, comparisons and statistical significance tests, the firefly algorithm appeared to be better suited than other state-of-the-art algorithms to clustering both low- and high-dimensional data objects. Further, an experimental study demonstrates the superiority of the three proposed hybrid algorithms over the standard state-of-the-art methods in finding meaningful clustering solutions to the problem at hand.
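The idea of identifying the number of partitions automatically can be illustrated with a small sketch that scores candidate cluster counts with the Davies-Bouldin (DB) index, one of the validity measures listed among the citation statements above, and keeps the count with the lowest score. This is not the algorithm of any cited paper: plain k-means stands in for the metaheuristic search, and the helper names `kmeans`, `davies_bouldin`, and `choose_k` are assumptions made for the example.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means, used here as a stand-in for a metaheuristic search."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(x) / len(members) for x in zip(*members)]
    return centroids, labels


def davies_bouldin(points, centroids, labels):
    """DB index: mean over clusters of the worst (scatter_i + scatter_j) / separation.
    Lower values indicate compact, well-separated clusters."""
    k = len(centroids)
    scatter = []
    for j in range(k):
        members = [p for p, lab in zip(points, labels) if lab == j]
        s = sum(math.dist(p, centroids[j]) for p in members) / len(members) if members else 0.0
        scatter.append(s)
    total = 0.0
    for i in range(k):
        worst = 0.0
        for j in range(k):
            if j != i:
                d = math.dist(centroids[i], centroids[j])
                # Coincident centroids are treated as infinitely bad (assumption).
                ratio = float("inf") if d == 0 else (scatter[i] + scatter[j]) / d
                worst = max(worst, ratio)
        total += worst
    return total / k


def choose_k(points, k_values, seed=0):
    """Hypothetical helper: return the candidate k with the lowest DB index."""
    scores = {}
    for k in k_values:
        centroids, labels = kmeans(points, k, seed=seed)
        scores[k] = davies_bouldin(points, centroids, labels)
    return min(scores, key=scores.get), scores
```

A call such as `choose_k(points, [2, 3])` returns the selected cluster count together with the score of every candidate. One caveat of this sketch: single-point clusters have zero scatter, so the DB index can look artificially good when the candidate k approaches the number of points. The candidate range should therefore stay well below the dataset size.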