Software Defect Prediction (SDP) models are built using software metrics derived from software systems. The quality of SDP models depends largely on the quality of the software metrics (dataset) used to build them. High dimensionality is one of the data quality problems that affect the performance of SDP models. Feature selection (FS) is a proven method for addressing the dimensionality problem. However, the choice of FS method for SDP is still a problem, as most of the empirical studies on FS methods for SDP produce contradictory and inconsistent quality outcomes. FS methods behave differently because of their different underlying computational characteristics. This could be due to the choice of search method used in FS, because the impact of FS depends on the choice of search method. It is hence imperative to comparatively analyze the performance of FS methods based on different search methods in SDP. In this paper, four filter feature ranking (FFR) and fourteen filter feature subset selection (FSS) methods were evaluated using four different classifiers over five software defect datasets obtained from the National Aeronautics and Space Administration (NASA) repository. The experimental analysis showed that the application of FS improves the predictive performance of classifiers and that the performance of FS methods can vary across datasets and classifiers. Among the FFR methods, Information Gain demonstrated the greatest improvements in the performance of the prediction models. Among the FSS methods, Consistency Feature Subset Selection based on Best First Search had the best influence on the prediction models. However, prediction models based on FFR proved to be more stable than those based on FSS methods. Hence, we conclude that FS methods improve the performance of SDP models, and that there is no single best FS method, as their performance varied according to the dataset and the choice of prediction model. However, we recommend the use of FFR methods, as the prediction models based on FFR are more stable in terms of predictive performance.
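As an illustration of filter feature ranking with Information Gain, the following is a minimal sketch and not the evaluation pipeline used in the study. It assumes features that are already discrete (software metrics are usually continuous and would need to be discretized first), and the function and metric names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (name, score) pairs sorted from most to least informative."""
    scores = {name: information_gain(col, labels) for name, col in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    # Toy data: 1 = defective module, 0 = clean module (illustrative only).
    defect = [1, 1, 0, 0]
    metrics = {
        "high_complexity": [1, 1, 0, 0],  # perfectly aligned with the label
        "has_comments": [1, 0, 1, 0],     # carries no information about the label
    }
    for name, score in rank_features(metrics, defect):
        print(f"{name}: {score:.3f}")
```

A ranking method of this kind scores each feature independently, so a cut-off (for example, keeping the top k features) still has to be chosen separately.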
Feature selection has gained much consideration from scholars working in the domain of machine learning and data mining in recent years. Feature selection is a popular problem in machine learning, with the goal of finding an optimal set of features that increases accuracy. As a result, several studies have been conducted on multi-objective feature selection using numerous multi-objective techniques and algorithms. The objective of this paper is to present a systematic literature review of the challenges and issues of the multi-objective feature selection problem and to critically analyze the techniques proposed to tackle it. The review covers all related studies published from 2012 to 2019. The outcomes of the reviewed studies clearly show that there is not yet a perfect solution to the multi-objective feature selection problem. The authors believe that the review can serve as the main reference for the techniques and methods used to resolve the problem of multi-objective feature selection. Furthermore, current challenges and issues are discussed to identify promising research directions for further study.
Despite the superior performance of the Grey Wolf Optimizer (GWO) in many areas, stagnation in local optima may still be a concern. Several significant GWO factors can be explored to enhance the performance of feature selection in classification. Two conflicting concepts must be considered when using or designing a metaheuristic method: exploring the search space and exploiting the best solutions found. Balancing exploration and exploitation well improves the search algorithm's performance. To achieve a good balance, this paper proposes a binary hybrid of GWO and Harris Hawks Optimization (HHO) to form a memetic approach called HBGWOHHO. The sigmoid transfer function is used to map the continuous search space into a binary one, as required by the nature of feature selection. A wrapper-based k-nearest neighbor classifier is used to evaluate the goodness of the selected features. To validate the performance of the proposed method, 18 standard UCI benchmark datasets were used. The performance of the proposed hybrid method was compared with Binary Grey Wolf Optimizer (BGWO), Binary Particle Swarm Optimization (BPSO), Binary Harris Hawks Optimizer (BHHO), Binary Genetic Algorithm (BGA) and Binary Hybrid BWOPSO. The findings revealed that the proposed method was effective in improving the performance of the BGWO algorithm. The proposed hybrid method outperforms the BGWO algorithm in terms of accuracy, selected feature size, and computational time. Similarly, compared with the BPSO and BGA feature selection algorithms, the proposed HBGWOHHO yielded better accuracy and a smaller set of selected features in much lower computational time.
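The role of the sigmoid transfer function can be sketched as follows. This is a minimal illustration, not the HBGWOHHO implementation: it assumes the stochastic thresholding rule commonly used in binary metaheuristics (a feature is selected when a uniform random draw falls below the sigmoid of its position value), and the function names are hypothetical.

```python
import math
import random


def sigmoid(x: float) -> float:
    """S-shaped transfer function mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def binarize(position, rng):
    """Map a continuous position vector to a 0/1 feature mask.

    Assumed rule: bit j is set to 1 (feature j selected) when a uniform
    random number is below sigmoid(position[j]).
    """
    return [1 if rng.random() < sigmoid(x) else 0 for x in position]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    position = [2.5, -1.0, 0.0, 4.0, -3.0]  # illustrative continuous position
    mask = binarize(position, rng)
    selected = [j for j, bit in enumerate(mask) if bit == 1]
    print("mask:", mask)
    print("selected feature indices:", selected)
```

In a wrapper setting, each such mask would then be scored by training the classifier on the selected columns only.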
Feature selection, or dimensionality reduction, can be considered a multi-objective minimization problem with two objectives: minimizing the number of features and minimizing the error rate simultaneously. Despite this, most existing approaches treat feature selection as a single-objective optimization problem. Recently, the Multi-objective Grey Wolf Optimizer (MOGWO) was proposed to solve multi-objective optimization problems. However, MOGWO was originally designed for continuous optimization problems and hence cannot be applied directly to multi-objective feature selection problems, which are inherently discrete. Therefore, in this research, a binary version of MOGWO based on the sigmoid transfer function, called BMOGWO-S, is developed to optimize feature selection problems. A wrapper-based Artificial Neural Network (ANN) is used to assess the classification performance of a subset of selected features. To validate the performance of the proposed method, 15 standard benchmark datasets from the UCI repository are employed. The proposed BMOGWO-S was compared with MOGWO with a tanh transfer function, the Non-dominated Sorting Genetic Algorithm (NSGA-II), and Multi-objective Particle Swarm Optimization (MOPSO). The results showed that the proposed BMOGWO-S can effectively determine a set of non-dominated solutions. The proposed method outperforms the existing multi-objective approaches in most cases in terms of feature reduction as well as classification error rate, while benefiting from a lower computational cost.
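To make the notion of non-dominated solutions concrete, the sketch below filters candidate feature subsets by Pareto dominance over the two objectives named above (number of features, error rate), both minimized. It is an illustration with made-up values and hypothetical function names, not part of BMOGWO-S itself.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated(points):
    """Return the points that no other point dominates (the Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


if __name__ == "__main__":
    # Each tuple is (number of selected features, classification error rate).
    candidates = [(3, 0.10), (5, 0.08), (4, 0.12), (5, 0.10)]
    print(non_dominated(candidates))
```

Here (4, 0.12) and (5, 0.10) are both dominated by (3, 0.10), which uses fewer features at an equal or lower error rate, so only the other two candidates remain as trade-offs.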
In recent years, technology has advanced to the fourth industrial revolution (Industry 4.0), where the Internet of Things (IoT), fog computing, computer security, and cyberattacks have evolved exponentially on a large scale. The rapid development of IoT devices and networks in various forms generates enormous amounts of data, which in turn demand careful authentication and security. Artificial intelligence (AI) is considered one of the most promising methods for addressing cybersecurity threats and providing security. In this study, we present a systematic literature review (SLR) that categorizes, maps and surveys the existing literature on AI methods used to detect cybersecurity attacks in the IoT environment. The scope of this SLR includes an in-depth investigation of the most prominent AI techniques in cybersecurity and state-of-the-art solutions. A systematic search was performed on various electronic databases (SCOPUS, Science Direct, IEEE Xplore, Web of Science, ACM, and MDPI). Out of the identified records, 80 studies published between 2016 and 2021 were selected, surveyed and carefully assessed. This review has explored deep learning (DL) and machine learning (ML) techniques used in IoT security, and their effectiveness in detecting attacks. Several studies have also proposed smart intrusion detection systems (IDS) with intelligent architectural frameworks using AI to overcome the existing security and privacy challenges. It is found that support vector machines (SVM) and random forest (RF) are among the most used methods, owing to their high detection accuracy; efficient memory use may be another reason. In addition, other methods such as extreme gradient boosting (XGBoost), neural networks (NN) and recurrent neural networks (RNN) also perform well. This analysis also provides insight into the AI roadmap for detecting threats based on attack categories. Finally, we present recommendations for potential future investigations.