Rule-based classification is an important data classification task. The ant-mining rule-based classification algorithm, inspired by the ant colony optimization algorithm, performs comparably to existing methods in the literature and outperforms them in some application domains. One problem that often arises in rule-based classification is overfitting, and rule pruning is a framework for avoiding it. Furthermore, we find that the influence of rule pruning in ant-miner classification algorithms is equivalent to that of local search in stochastic methods, where the aim is to further improve each candidate solution. In this paper, we review the history of pruning techniques in ant-miner and its variants. These techniques are classified into post-pruning, pre-pruning and hybrid pruning. In addition, we compare and analyse the advantages and disadvantages of these methods. Finally, future research directions for finding new hybrid rule pruning techniques are provided.
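To make the link between rule pruning and local search concrete, the following is a minimal Python sketch of greedy post-pruning. It is not taken from the paper: the rule encoding (a dictionary of attribute-value terms), the toy dataset, and the function names `covers`, `quality` and `post_prune` are illustrative assumptions. The quality measure used here, sensitivity times specificity, is the one usually described for the original Ant-Miner, and the consequent is kept fixed for simplicity.

```python
from typing import Dict, List, Tuple

Record = Tuple[Dict[str, str], str]  # (attribute values, class label)


def covers(antecedent: Dict[str, str], attrs: Dict[str, str]) -> bool:
    """A rule covers a record when every term of its antecedent matches."""
    return all(attrs.get(a) == v for a, v in antecedent.items())


def quality(antecedent: Dict[str, str], target: str, data: List[Record]) -> float:
    """Rule quality as sensitivity * specificity (assumed measure)."""
    tp = fp = fn = tn = 0
    for attrs, label in data:
        covered = covers(antecedent, attrs)
        if covered and label == target:
            tp += 1
        elif covered:
            fp += 1
        elif label == target:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity * specificity


def post_prune(antecedent: Dict[str, str], target: str, data: List[Record]) -> Dict[str, str]:
    """Greedily drop the term whose removal gives the best quality.

    Stops when only one term is left or when every removal lowers quality.
    Ties are resolved in favour of the shorter rule.
    """
    current = dict(antecedent)
    best_q = quality(current, target, data)
    while len(current) > 1:
        candidates = []
        for attr in current:
            trial = {a: v for a, v in current.items() if a != attr}
            candidates.append((quality(trial, target, data), attr))
        q, attr = max(candidates)
        if q < best_q:
            break
        del current[attr]
        best_q = q
    return current


# Hypothetical toy data: the class depends only on "outlook".
data: List[Record] = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "no"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
]
pruned = post_prune({"outlook": "sunny", "windy": "no"}, "play", data)
print(pruned)
```

Each pass of the loop evaluates the neighbours of the current rule that are one term shorter and moves to the best one, which is the sense in which pruning behaves like a local search around a candidate solution.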
Data clustering is used in a number of fields, including statistics, bioinformatics, machine learning, exploratory data analysis, image segmentation, security, medical image analysis, web handling and mathematical programming. Its role is to group data into clusters with high similarity within clusters and high dissimilarity between clusters. This paper reviews the problems that affect clustering performance in deterministic and stochastic clustering approaches. In deterministic clustering, the problems are caused by sensitivity to the number of provided clusters. In stochastic clustering, problems are caused either by the absence of an optimal number of clusters or by the projection of data. The review focuses on ant-based sorting and ACO-based clustering, which suffer from slow convergence, non-robust results and convergence to local optima. The results of this review can guide researchers working in the area of data clustering, as it shows the strengths and weaknesses of both clustering approaches.
In this study, a hybrid rule-based classifier, ant colony optimization/genetic algorithm (ACO/GA), is introduced to improve the classification accuracy of the Ant-Miner classifier by using a GA. The Ant-Miner classifier is efficient, useful and commonly used for solving rule-based classification problems in data mining. Ant-Miner, which is an ACO variant, suffers from a local optima problem that affects its performance. In our proposed hybrid ACO/GA algorithm, the ACO is responsible for generating classification rules, and the GA improves the classification rules iteratively using multi-neighborhood structure procedures (i.e., mutation and crossover) to overcome the local optima problem. The performance of the proposed classifier was tested against other existing hybrid ant-mining classification algorithms, namely ACO/SA and ACO/PSO2, using classification accuracy, the number of discovered rules and model complexity. For the experiment, the 10-fold cross-validation procedure was used on 12 benchmark datasets from the University of California Irvine (UCI) machine learning repository. Experimental results show that the proposed hybridization produced impressive results on all evaluation criteria.
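The GA refinement step described above can be sketched roughly as follows. This is not the authors' implementation: the rule encoding, the uniform crossover, the single-term mutation, the elitist replacement scheme, the toy fitness function and all names (`crossover`, `mutate`, `refine`, `toy_fitness`) are assumptions made for illustration. In a real hybrid, the starting rules would come from the ACO phase and the fitness would be a rule quality or accuracy measure computed on training data.

```python
import random
from typing import Callable, Dict, List

Rule = Dict[str, str]  # antecedent: attribute -> required value


def crossover(a: Rule, b: Rule, rng: random.Random) -> Rule:
    """Uniform crossover: each attribute's term is copied from one parent."""
    child: Rule = {}
    for attr in sorted(set(a) | set(b)):
        donor = a if rng.random() < 0.5 else b
        if attr in donor:
            child[attr] = donor[attr]
    return child


def mutate(rule: Rule, domains: Dict[str, List[str]], rng: random.Random) -> Rule:
    """Pick one attribute and either re-draw its value or drop its term."""
    mutant = dict(rule)
    attr = rng.choice(sorted(domains))
    choice = rng.choice(domains[attr] + [None])  # None means "no term"
    if choice is None:
        mutant.pop(attr, None)
    else:
        mutant[attr] = choice
    return mutant


def refine(rules: List[Rule], fitness: Callable[[Rule], float],
           domains: Dict[str, List[str]], generations: int = 20,
           seed: int = 0) -> Rule:
    """Iteratively improve a rule set (needs at least two rules).

    The best rule of each generation is copied unchanged (elitism), so the
    returned rule is never worse than the best starting rule.
    """
    rng = random.Random(seed)
    population = [dict(r) for r in rules]
    for _ in range(generations):
        offspring = [max(population, key=fitness)]
        while len(offspring) < len(population):
            p1, p2 = rng.sample(population, 2)
            offspring.append(mutate(crossover(p1, p2, rng), domains, rng))
        population = offspring
    return max(population, key=fitness)


# Hypothetical setup: fitness counts terms that agree with a hidden target rule.
domains = {"a": ["0", "1"], "b": ["0", "1"]}
hidden = {"a": "1", "b": "0"}


def toy_fitness(rule: Rule) -> float:
    return float(sum(1 for k, v in rule.items() if hidden.get(k) == v))


start = [{"a": "0", "b": "1"}, {"a": "1", "b": "1"}, {"a": "0"}]
best = refine(start, toy_fitness, domains)
print(best, toy_fitness(best))
```

Crossover and mutation define two different neighborhoods around the current rules, which is one way to read the "multi-neighborhood structure" mentioned in the abstract. The elitist copy is a design choice of this sketch that keeps the search from losing the best rule found so far.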