Software fault prediction (SFP) is a complex problem that developers face throughout the software development life cycle. Collecting data from real software projects, either during the development life cycle or after launching the product, is not a simple task, and the collected data may suffer from an imbalanced data distribution. In this research, we propose an Enhanced Binary Moth Flame Optimization (EBMFO) with Adaptive Synthetic Sampling (ADASYN) to predict software faults. BMFO is employed as a wrapper feature selection method, while ADASYN enhances the input dataset and addresses its class imbalance. This work investigates converting the MFO algorithm from a continuous version to a binary version using transfer functions (TFs) from two different groups (S-shaped and V-shaped), and proposes an EBMFO version. Fifteen real project datasets obtained from the PROMISE repository are employed in this work. Three different classifiers are used: k-nearest neighbors (k-NN), Decision Trees (DT), and Linear Discriminant Analysis (LDA). The reported results demonstrate that the proposed EBMFO enhances the overall performance of the classifiers and outperforms the results in the literature, and they show the importance of the TF for feature selection algorithms.
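The conversion from a continuous optimizer to a binary one can be illustrated with a minimal sketch. The abstract does not say which specific transfer functions were used, so the sigmoid (a common S-shaped choice) and |tanh| (a common V-shaped choice) below are assumptions, as are the function names and the update rules, which follow the usual convention in the binary metaheuristics literature: an S-shaped TF gives the probability that a bit is set to 1, while a V-shaped TF gives the probability that the current bit is flipped.

```python
import math
import random


def s_shape(x):
    """Sigmoid TF (assumed S-shaped representative): real value -> [0, 1]."""
    # Numerically stable form that avoids overflow for large negative x.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def v_shape(x):
    """|tanh(x)| TF (assumed V-shaped representative): real value -> [0, 1]."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """S-shaped rule: each bit becomes 1 with probability s_shape(x)."""
    return [1 if rng.random() < s_shape(x) else 0 for x in position]


def binarize_v(position, previous_bits, rng):
    """V-shaped rule: each bit is flipped with probability v_shape(x)."""
    return [
        1 - bit if rng.random() < v_shape(x) else bit
        for x, bit in zip(position, previous_bits)
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    continuous = [-2.0, -0.5, 0.0, 0.5, 2.0]  # hypothetical moth position
    mask = binarize_s(continuous, rng)         # 1 = feature selected
    print("S-shaped mask:", mask)
    print("V-shaped mask:", binarize_v(continuous, mask, rng))
```

In a wrapper setting, each resulting bit mask would select a feature subset that is then scored by a classifier; that evaluation step is omitted here.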
Software fault prediction (SFP) is a challenging process that any successful software should go through to make sure that all software components are free of faults. In general, soft computing and machine learning methods are useful in tackling this problem. The size of fault data is usually huge since it is obtained by mining historical software repositories. This data consists of a large number of features (metrics). Determining the most valuable features, i.e., Feature Selection (FS), is an excellent solution to reduce data dimensionality. In this paper, we propose an enhanced version of the Whale Optimization Algorithm (WOA) by combining it with a single-point crossover method. The proposed enhancement helps the WOA escape from local optima by enhancing the exploration process. Five different selection methods are employed: tournament, roulette wheel, linear rank, stochastic universal sampling, and random-based. To evaluate the performance of the proposed enhancement, 17 available SFP datasets are adopted from the PROMISE repository. The in-depth analysis shows that the proposed approach outperformed the original WOA and six other state-of-the-art methods, and enhanced the overall performance of the machine learning classifier.
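As an illustration of two of the operators named above, the following sketch shows single-point crossover on binary feature masks together with tournament selection. The abstract does not describe how these operators are wired into the WOA update, so this is only a sketch of the generic operators. The function names, the tournament size `k`, and the convention that lower fitness is better (for example, a classification error rate) are all assumptions.

```python
import random


def single_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length solutions at one random cut point."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    if len(parent_a) < 2:
        # Nothing to cut: return copies of the parents unchanged.
        return list(parent_a), list(parent_b)
    cut = rng.randrange(1, len(parent_a))  # cut strictly inside the vector
    child_a = list(parent_a[:cut]) + list(parent_b[cut:])
    child_b = list(parent_b[:cut]) + list(parent_a[cut:])
    return child_a, child_b


def tournament_select(population, fitness, k, rng):
    """Pick k distinct candidates at random and return the fittest one.

    Assumes lower fitness is better (hypothetical minimization setup).
    """
    contenders = rng.sample(range(len(population)), k)
    best = min(contenders, key=lambda i: fitness[i])
    return population[best]


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical population of binary feature masks with error rates.
    population = [[1, 0, 1, 1, 0, 0], [0, 1, 1, 0, 1, 0], [1, 1, 0, 0, 0, 1]]
    fitness = [0.21, 0.18, 0.25]
    mate = tournament_select(population, fitness, k=2, rng=rng)
    children = single_point_crossover(population[0], mate, rng)
    print(children)
```

The other selection schemes listed in the abstract (roulette wheel, linear rank, stochastic universal sampling, random) would replace `tournament_select` with a different rule for choosing the mate.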