In recent years, advanced threat attacks have been increasing, but traditional network intrusion detection systems based on feature filtering have drawbacks that make it difficult to detect new attacks in time. This paper takes the NSL-KDD data set as its research object, analyses the latest progress and existing problems in intrusion detection technology, and proposes an adaptive ensemble learning model. By adjusting the proportion of training data and setting up multiple decision trees, we construct a MultiTree algorithm. To improve the overall detection effect, we choose several base classifiers, including decision tree, random forest, kNN, and DNN, and design an ensemble adaptive voting algorithm. We use NSL-KDD Test+ to verify our approach: the accuracy of the MultiTree algorithm is 84.2%, while the final accuracy of the adaptive voting algorithm reaches 85.2%. Comparison with other research papers shows that our ensemble model effectively improves detection accuracy. In addition, analysis of the data shows that the quality of data features is an important factor in determining the detection effect. In the future, we should optimize the feature selection and preprocessing of intrusion detection data to achieve better results.
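The abstract does not spell out the adaptive voting rule, so the following is only a minimal sketch of one plausible reading: each base classifier's vote is weighted by how well that classifier recognised the predicted class on held-out validation data. The function names (`class_weights`, `adaptive_vote`), the use of per-class recall as the weight, and the toy labels are all assumptions made for illustration, not the authors' method.

```python
from collections import defaultdict


def class_weights(val_preds, val_labels):
    """Per-class recall of one classifier on validation data.

    Assumption: recall on each true class is used as the vote weight.
    The paper may use a different weighting scheme.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for pred, label in zip(val_preds, val_labels):
        total[label] += 1
        if pred == label:
            correct[label] += 1
    return {c: correct[c] / total[c] for c in total}


def adaptive_vote(predictions, weights):
    """Combine one prediction per classifier into a single label.

    predictions: {classifier_name: predicted_label}
    weights:     {classifier_name: {label: weight}}
    A vote for a label counts with the weight that classifier earned
    for that label; labels a classifier never saw get weight 0.
    """
    scores = defaultdict(float)
    for name, label in predictions.items():
        scores[label] += weights[name].get(label, 0.0)
    return max(scores, key=scores.get)


# Hypothetical validation results for three base classifiers.
val_labels = ["normal", "normal", "attack", "attack"]
weights = {
    "tree": class_weights(["normal", "normal", "normal", "attack"], val_labels),
    "knn": class_weights(["attack", "normal", "attack", "attack"], val_labels),
    "dnn": class_weights(["normal", "attack", "attack", "normal"], val_labels),
}

# One test sample on which the classifiers disagree.
decision = adaptive_vote({"tree": "attack", "knn": "attack", "dnn": "normal"}, weights)
```

Unlike plain majority voting, a single classifier that is reliable on a given class can outvote several classifiers that are weak on the class they predict, which is the behaviour an adaptive scheme is meant to provide.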
Case-based reasoning (CBR) has been used in various problem-solving areas such as financial forecasting, credit analysis and medical diagnosis. However, conventional CBR has no criterion for choosing the nearest cases based on the probabilistic similarity of cases. It uses a fixed number of neighbors without considering the optimal number for each target case, so it does not guarantee optimal similar neighbors across different target cases. As a result, predictive performance suffers whenever the selected neighbors deviate from the desired similar neighbors. In this paper we propose a new case extraction technique called statistical case-based reasoning. The main idea is to adapt the number of neighbors dynamically by considering the distribution of distances between potential similar neighbors for each target case. To do this, our technique finds the optimal distance threshold and selects the similar neighbors that satisfy the distance threshold criterion. We apply this new method to five real-life medical data sets and compare the results with those of a statistical method, logistic regression, and of the learning methods C5.0, CART, neural networks and conventional CBR. The results show that the proposed technique outperforms many other methods, overcomes the limitation of conventional CBR, and provides improved classification accuracy.
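The idea of choosing neighbors by a per-target distance threshold instead of a fixed k can be sketched as follows. The abstract does not state how the optimal threshold is derived, so this sketch assumes a simple stand-in rule: the threshold is the mean of the distances from the target to all stored cases minus `z` standard deviations. The names `select_neighbors` and `classify`, the parameter `z`, the Euclidean metric, and the toy cases are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_neighbors(target, cases, z=1.0):
    """Return (distance, label) pairs for cases within the threshold.

    Assumed rule: threshold = mean(distances) - z * std(distances),
    computed separately for every target case, so the number of
    neighbors varies from target to target.
    """
    dists = sorted(
        ((euclidean(target, features), label) for features, label in cases),
        key=lambda pair: pair[0],
    )
    values = [d for d, _ in dists]
    threshold = mean(values) - z * pstdev(values)
    chosen = [(d, label) for d, label in dists if d <= threshold]
    # Fall back to the single nearest case if nothing passes the threshold.
    return chosen or dists[:1]


def classify(target, cases, z=1.0):
    """Majority vote among the dynamically selected neighbors."""
    neighbors = select_neighbors(target, cases, z)
    return Counter(label for _, label in neighbors).most_common(1)[0][0]


# Toy case base with two well-separated groups (illustrative labels only).
cases = [
    ((0, 0), "benign"), ((0, 1), "benign"), ((1, 0), "benign"),
    ((10, 10), "malignant"), ((10, 11), "malignant"), ((11, 10), "malignant"),
]

label = classify((0.2, 0.2), cases)
```

With this rule, a stricter `z` keeps only the closest cases while a looser `z` admits the whole nearby cluster, so the neighbor count follows the distance distribution around each target instead of being fixed in advance.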