2015 11th International Conference on Computational Intelligence and Security (CIS)
DOI: 10.1109/cis.2015.27

Network Traffic Classification with Improved Random Forest

Cited by 23 publications (7 citation statements)
References: 18 publications
“…Input: imbalanced training set S, scaling factor K, instance hardness threshold IH′, and sampling threshold UB. Output: new training set S_N.
(Step 1) Distinguish easy samples from difficult samples: for each sample ∈ S, compute its K nearest neighbors and its instance hardness IH; if IH > IH′, put the sample into the difficult set. This yields the difficult set S_D and the easy set S_E = S − S_D.
(Step 2) Compress the majority samples in the difficult set by cluster centroids: take all the majority samples from S_D as S_Maj, run the K-means algorithm with K clusters, and replace the samples in S_Maj with the coordinates of the K cluster centroids, giving the compressed majority set S_Maj.
(Step 3) Oversample the minority samples in the difficult set with the SMOTE algorithm: take all the minority samples from S_D as S_Min; for each sample ∈ S_Min, apply SMOTE with the sampling threshold set to UB and put the new samples into S_Z.
(Step 4) Merge the sample sets.
Precision is the ratio of the number of samples whose true value is positive to the number of samples predicted to be positive, and represents the model's ability to predict positive samples, as follows:…”
Section: Evaluation Metrics and Baseline Methods
confidence: 99%
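As a reading aid for the SD sampling steps quoted above, here is a minimal Python sketch of the four steps; it is an interpretation, not the authors' code. The instance-hardness estimator (fraction of the K nearest neighbors with a different label), the scikit-learn/imbalanced-learn calls, and the name sd_sample are assumptions, and UB is read as the target minority/majority ratio handed to SMOTE.

```python
# Hypothetical sketch of the quoted SD sampling steps for a binary problem
# where label 0 is the majority class and label 1 the minority class.
# Requires scikit-learn and imbalanced-learn; edge cases (empty sets) omitted.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.cluster import KMeans
from imblearn.over_sampling import SMOTE

def sd_sample(X, y, K=5, ih_threshold=0.5, ub=1.0, n_centroids=10):
    # Step 1: instance hardness = share of the K nearest neighbors whose
    # label differs (one common estimator; the paper may define IH otherwise).
    _, idx = NearestNeighbors(n_neighbors=K + 1).fit(X).kneighbors(X)
    ih = (y[idx[:, 1:]] != y[:, None]).mean(axis=1)  # idx[:, 0] is the point itself
    hard = ih > ih_threshold
    X_easy, y_easy = X[~hard], y[~hard]
    X_hard, y_hard = X[hard], y[hard]

    # Step 2: compress the majority samples of the difficult set into
    # K-means cluster centroids (undersampling).
    maj = y_hard == 0
    km = KMeans(n_clusters=min(n_centroids, int(maj.sum())), n_init=10)
    X_maj = km.fit(X_hard[maj]).cluster_centers_

    # Step 3: oversample the minority samples of the difficult set with
    # SMOTE up to the sampling threshold UB (read as minority/majority ratio).
    X_min = X_hard[~maj]
    X_tmp = np.vstack([X_maj, X_min])
    y_tmp = np.hstack([np.zeros(len(X_maj), dtype=int), np.ones(len(X_min), dtype=int)])
    smote = SMOTE(sampling_strategy=ub, k_neighbors=max(1, min(K, len(X_min) - 1)))
    X_res, y_res = smote.fit_resample(X_tmp, y_tmp)

    # Step 4: merge the resampled difficult set with the untouched easy set.
    return np.vstack([X_easy, X_res]), np.hstack([y_easy, y_res])
```

The precision definition the quote trails off into is the standard one, Precision = TP / (TP + FP): of all samples predicted positive, the share whose true label is positive.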
“…The SD sampling algorithm combines oversampling and undersampling methods and considers the spatial distribution of samples during sampling, which overcomes the overgeneralization problem of the SMOTE algorithm to some extent. (2) We propose a two-layer structure combining XGBoost [10] and random forest [11] to realize multi-class classification of traffic, which improves the detection rate and generalization ability of the model. (3) We evaluate the performance of the SD sampling algorithm and the proposed classification model on the CICIDS2017 dataset [12].…”
Section: Key Contributions and Paper
confidence: 99%
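The "two-layer structure" combining XGBoost and random forest is only named in the quote, not specified. One plausible reading, sketched below under stated assumptions, is a first XGBoost layer that flags attack traffic and a second random-forest layer that assigns the attack category; the class TwoLayerClassifier and the benign-label convention (y == 0) are hypothetical.

```python
# Minimal sketch of one possible two-layer traffic classifier; the cited
# paper may combine the two models differently (e.g., stacking or voting).
import numpy as np
from xgboost import XGBClassifier
from sklearn.ensemble import RandomForestClassifier

class TwoLayerClassifier:
    def __init__(self):
        self.detector = XGBClassifier(n_estimators=200)               # layer 1: benign vs. attack
        self.categorizer = RandomForestClassifier(n_estimators=200)   # layer 2: attack category

    def fit(self, X, y):
        # Convention assumed here: y == 0 is benign, nonzero values are attack types.
        self.detector.fit(X, (y != 0).astype(int))
        attack = y != 0
        self.categorizer.fit(X[attack], y[attack])
        return self

    def predict(self, X):
        pred = np.zeros(len(X), dtype=int)               # default: benign
        flagged = self.detector.predict(X).astype(bool)  # layer 1 decision
        if flagged.any():
            pred[flagged] = self.categorizer.predict(X[flagged])
        return pred
```

Such staging lets the second model specialize on the typically smaller and more imbalanced attack subset, which is one common motivation for two-layer designs.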
“…To verify the advantages of the 1DCAE-IndRNN method proposed in this paper for malware traffic detection, two classical machine learning methods and three deep learning methods are selected from the literature for experimental comparison. The classical machine learning methods are random forest (RF) [16] and XGBoost [17]; the deep learning methods are deep neural networks (DNN) [18], recurrent neural networks (RNN) [19], and long short-term memory (LSTM) [20].…”
Section: Comparison With Other Methods
confidence: 99%
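For the two classical baselines named above, a hypothetical evaluation harness might look like the following; the split parameters and model settings are illustrative, and the deep baselines (DNN/RNN/LSTM) would need a separate deep-learning stack and are omitted.

```python
# Illustrative comparison of the RF and XGBoost baselines on one stratified
# split; labels are assumed to be encoded as consecutive integers 0..n-1.
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

def compare_baselines(X, y):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=0)
    for name, model in [("RF", RandomForestClassifier(n_estimators=200)),
                        ("XGBoost", XGBClassifier(n_estimators=200))]:
        model.fit(X_tr, y_tr)          # train on the shared split
        print(name)
        print(classification_report(y_te, model.predict(X_te)))
```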
“…In [14], a Naive Bayes technique was used to classify cells based on their network traffic patterns. A random forest-based approach was likewise used for traffic pattern detection in the application layer [15]. Finally, in [4], the authors proposed an approach combining unsupervised and supervised techniques to analyze the performance of a mobile network, monitoring real-time traffic to detect possible changes.…”
Section: Related Work
confidence: 99%