In large-scale data environments, the "curse of dimensionality" of high-dimensional feature spaces and the abundance of noisy data significantly degrade the efficiency and accuracy of intrusion detection systems (IDSs). To address these challenges, the underlying algorithm must not only reduce dimensionality but also remove redundant and irrelevant noisy data from the massive input. Accordingly, this study proposes an IDS that combines an improved deep belief network (IDBN) with a feature-weighted support vector machine (WSVM). First, an adaptive learning-rate strategy is applied to improve the training performance of the IDBN, which learns deep features from raw data to reduce dimensionality. Second, particle swarm optimization (PSO) is used to optimize the SVM, determining both the weights of the deep features and the best parameters of the Gaussian kernel; the resulting WSVM removes weakly related and redundant features from the IDBN-extracted feature set. The NSL-KDD dataset was used to validate the IDBN-WSVM model, and its performance was compared with that of a non-weighted SVM and other machine learning methods. Experimental results demonstrate that IDBN-WSVM is well suited to building high-precision classification models. The proposed model achieves accuracies of 85.73% and 82.36% in binary and five-category classification experiments, respectively, which is better than or close to state-of-the-art methods. The IDBN-WSVM model not only reduces training and testing time on large-scale datasets but is also more robust and generalizes better than traditional methods, providing a new approach for achieving high accuracy in intrusion detection tasks.
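The feature-weighted Gaussian kernel at the core of the WSVM can be sketched as follows. This is a minimal numpy illustration, not the paper's implementation: the weight vector would in practice be searched for by PSO, and all names and values here are hypothetical.

```python
import numpy as np

def weighted_gaussian_kernel(x, y, weights, sigma=1.0):
    """Gaussian (RBF) kernel with per-feature weights.

    A weight of 0 effectively removes a feature from the kernel,
    which is how a feature-weighted SVM can discard weakly related
    or redundant features.
    """
    diff = x - y
    return np.exp(-np.sum(weights * diff ** 2) / (2.0 * sigma ** 2))

# Toy example: the third feature is pure noise, so its weight is 0.
x = np.array([1.0, 2.0, 9.9])
y = np.array([1.1, 1.8, -3.4])
w = np.array([1.0, 0.7, 0.0])   # in the paper, PSO would optimize these

k_weighted = weighted_gaussian_kernel(x, y, w)
k_no_noise = weighted_gaussian_kernel(x[:2], y[:2], w[:2])
# A zero weight gives the same kernel value as dropping the feature.
assert np.isclose(k_weighted, k_no_noise)
```

Setting a feature's weight to zero yields exactly the kernel computed without that feature, so optimizing the weights performs soft feature selection inside the kernel itself.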
Owing to constraints of time and space complexity, network intrusion detection systems (NIDSs) based on support vector machines (SVMs) face the "curse of dimensionality" in large-scale, high-dimensional feature spaces. This study proposes a joint training model that combines a stacked autoencoder (SAE) with an SVM and a kernel-approximation technique. The model uses the SAE for feature dimension reduction and random Fourier features for kernel approximation: a random Fourier mapping is explicitly applied to each sub-sample to generate a random feature space, making it possible for a linear SVM to uniformly approximate a Gaussian-kernel SVM. Finally, the SAE is jointly trained with the efficient linear SVM. We studied the effects of the SAE structure and the random Fourier features on classification performance, and compared that performance with that of other training models, including models without kernel approximation. We also compared the accuracy of the proposed model with that of basic machine learning models and state-of-the-art models from the literature. The experimental results demonstrate that the proposed model outperforms previously proposed methods in classification performance while also reducing training time. The model is feasible and works efficiently on large-scale datasets.
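The random Fourier feature step described above can be illustrated with a minimal numpy sketch (the names, dimensions, and constants are hypothetical, not taken from the paper): the explicit map z(x) = sqrt(2/D) * cos(Wx + b) has the property that z(x)·z(y) approximates the Gaussian kernel exp(-||x - y||^2 / (2 * sigma^2)), so a linear SVM trained on z(x) approximates a Gaussian-kernel SVM at linear-model cost.

```python
import numpy as np

def random_fourier_map(X, n_features=2000, sigma=1.0, seed=0):
    """Explicit random Fourier map whose inner products approximate
    the Gaussian kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies sampled from the Gaussian kernel's spectral density,
    # plus uniform random phases.
    W = rng.normal(0.0, 1.0 / sigma, size=(d, n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

# Compare the approximate kernel to the exact Gaussian kernel on two points.
x = np.array([[0.3, -1.2, 0.5]])
y = np.array([[0.1,  0.4, -0.2]])
Z = random_fourier_map(np.vstack([x, y]))
k_approx = Z[0] @ Z[1]
k_exact = np.exp(-np.sum((x - y) ** 2) / 2.0)
# The gap shrinks as n_features grows (roughly O(1/sqrt(n_features))).
print(abs(k_approx - k_exact))
```

Because the mapping is explicit, the transformed data can be fed directly to any efficient linear classifier, which is what allows the joint SAE–linear-SVM training described above to scale to large datasets.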