Feature Selection for Intrusion Detection Using Random Forest

Hasan, Md. Al Mehedi; Nasser, Mohammed; Ahmad, Shamim; Molla, Khademul Islam

doi:10.4236/jis.2016.73009

Cited by 106 publications

(59 citation statements)

References 13 publications

(21 reference statements)


“…The KDD Cup99 and NSL-KDD datasets include four categories of attack [24]. The Table 1 shows the number of rows for each category: UNSW-NB15: The network packets of this dataset were collected by the IXIA Perfect Storm tool in the Cyber Range Lab of the Australian Centre for Cyber Security (ACCS) to generate a hybrid combination of real-life and contemporary synthetic attack behaviors.…”

Section: Dataset (mentioning)

confidence: 99%

A Machine Learning-Based Lightweight Intrusion Detection System for the Internet of Things

Fenanir, Semchedine, Baadache

2019

RIA


The Internet of Things (IoT) is vulnerable to various attacks, due to the presence of tiny computing devices. To enhance the security of the IoT, this paper builds a lightweight intrusion detection system (IDS) based on two machine learning techniques, namely, feature selection and feature classification. The feature selection was realized by the filter-based method, thanks to its relatively low computing cost. The feature classification algorithm for our system was identified through comparison between logistic regression (LR), naive Bayes (NB), decision tree (DT), random forest (RF), k-nearest neighbor (KNN), support vector machine (SVM) and multilayer perceptron (MLP). Finally, the DT algorithm was selected for our system, owing to its outstanding performance on several datasets. The research results provide a guide on choosing the optimal feature selection method for machine learning.
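As a rough illustration of what a filter-based selection step can look like, the sketch below scores each feature independently of any classifier and keeps the top k. The abstract does not say which filter criterion was used, so the absolute Pearson correlation with the label is an assumption made here, and the names `pearson`, `filter_select`, and the toy columns are hypothetical.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.
    Returns 0.0 when either sequence is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

def filter_select(columns, labels, k):
    """Rank features by |correlation with the label| and keep the top k.
    `columns` maps a feature name to its list of values.
    The scoring criterion is an assumption, not the cited paper's method."""
    scores = {name: abs(pearson(values, labels)) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]

# Toy data: four records, binary label (0 = normal, 1 = attack).
columns = {
    "duration": [1, 2, 3, 4],
    "flag": [0, 0, 0, 0],
    "bytes": [5, 1, 4, 2],
}
labels = [0, 0, 1, 1]
selected = filter_select(columns, labels, k=1)
```

Because no classifier is trained during scoring, the cost grows only with the number of features and records, which is consistent with the low computing cost the abstract attributes to filter methods.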


“…Maximum number of features used can be reduced up to √A where A represents number of features of the dataset used. Various researchers [23][24] proved that classifier can achieve better accuracy if less number of features are used with reduced processing time. Various feature reduction techniques are used to improve performance of classifiers.…”

Section: Feature Selection (mentioning)

confidence: 99%

“…It avoids over fitting as features and data are randomly selected, it also handles missing values from data. Random forest is best algorithm to be used in distributed environment [24].…”

Section: Random Forest (RF) (mentioning)

confidence: 99%

Hybrid Architecture for Distributed Intrusion Detection System

Khonde, Ulagamuthalvi

2019

ISI


In the field of information security, attack detection and protection of information from intruders become a new area of research now a days. Due to ever changing technologies and modern methodologies intruders use polymorphic mechanism to deception attack. Various attacks like distributed denial of service, goldeneye, user to root, local to user, remote login become the great threat to the network. To take care of information utmost care is taken to provide network security with the help of various Intrusion Detection System (IDS). IDS helps to detect the threats to the network and can provide various strategies to avoid them. Most of the IDS work intelligently to detect the malicious activities or any abnormal behavior in the network. It leads to the detection of attack and prevention actions can be taken to protect information and provide security to the network. This paper presents an intelligent ID which monitors the real time network traffic to observe the behavior of packets. On the basis of observation detection is done for malicious or normal packets. Action is taken by administrator to prevent the network once the attack is detected by IDS. For attack detection ensembling of various classifiers is done such as Support Vector Machine, Naïve Bayes, k Nearest Neighbor, stochastic gradient descent, logistic regression, Random Forest and Decision tree. All classifiers used classification methods to classify packets in malicious and normal category. Preprocessing is done to reduce features for minimizing training time of all classifiers. Variable importance and Gini index techniques are used to reduce features. Reduced features are used by individual classifier to classify packets in proposed hybrid model. Majority algorithm is used to ensemble the results of all individual classifier to give the final class of packet as attack or normal. All the classifiers work in distributed network to classify the attacks. NSL-KDD dataset is used to train the classifiers. Testing of proposed system is done by capturing real time traffic on the network. From results it is observed that ensembling of more classifiers increases the detection accuracy of IDS significantly and reduces the false alarm rate. It also helps in improving the system performance in terms of execution time and detection rate with increased true positive rate.
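The majority step described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: each classifier is modelled as a plain function from a packet record to a label, the three toy rules and their thresholds are hypothetical, and ties are resolved in favour of the label voted first.

```python
from collections import Counter

def majority_vote(predictions):
    """Return the label predicted by the most classifiers.
    On a tie, the label that was encountered first wins."""
    if not predictions:
        raise ValueError("need at least one prediction")
    return Counter(predictions).most_common(1)[0][0]

def ensemble_classify(classifiers, packet):
    """Ask every classifier for a label and combine the labels by majority."""
    votes = [classify(packet) for classify in classifiers]
    return majority_vote(votes)

# Hypothetical stand-ins for trained models: each maps a packet dict to a label.
def by_size(packet):
    return "attack" if packet["bytes"] > 1000 else "normal"

def by_rate(packet):
    return "attack" if packet["rate"] > 50 else "normal"

def by_flag(packet):
    return "attack" if packet["flag"] == "S0" else "normal"

packet = {"bytes": 1500, "rate": 10, "flag": "S0"}
label = ensemble_classify([by_size, by_rate, by_flag], packet)
```

Using an odd number of binary classifiers avoids ties altogether; the abstract lists seven.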


“…Therefore, feature selection has been considered one of the most important steps in building a failure prediction model. There have been many studies to build the failure prediction model using feature selection, most of which have selected features considering the importance of each feature [20][21][22]. Moldovan et al built a failure prediction model using the selected features to improve prediction accuracy and performed feature selection using three algorithms (i.e., random forest, regression analysis, and orthogonal linear transformation) to compare the prediction accuracy of each for the comparative study [20].…”

Section: Introduction (mentioning)

confidence: 99%

Failure Prediction Model Using Iterative Feature Selection for Industrial Internet of Things

Kwon, Kim

2020

Symmetry


This paper presents a failure prediction model using iterative feature selection, which aims to accurately predict the failure occurrences in industrial Internet of Things (IIoT) environments. In general, vast amounts of data are collected from various sensors in an IIoT environment, and they are analyzed to prevent failures by predicting their occurrence. However, the collected data may include data irrelevant to failures and thereby decrease the prediction accuracy. To address this problem, we propose a failure prediction model using iterative feature selection. To build the model, the relevancy between each feature (i.e., each sensor) and the failure was analyzed using the random forest algorithm, to obtain the importance of the features. Then, feature selection and model building were conducted iteratively. In each iteration, a new feature was selected considering the importance and added to the selected feature set. The failure prediction model was built for each iteration via the support vector machine (SVM). Finally, the failure prediction model having the highest prediction accuracy was selected. The experimental implementation was conducted using open-source R. The results showed that the proposed failure prediction model achieved high prediction accuracy.
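The iterative procedure in this abstract can be sketched in a few lines. The sketch assumes that features are added in descending order of importance, which the abstract implies but does not state outright. The `evaluate` callback is a hypothetical stand-in for training an SVM on the chosen features and returning its accuracy, and the importance values and accuracies below are made up for illustration.

```python
def iterative_feature_selection(importances, evaluate):
    """Add features one at a time in descending importance and keep the
    feature set whose model scored best.

    importances: dict mapping feature name -> importance score
                 (for example, taken from a random forest).
    evaluate:    callable taking a list of feature names and returning the
                 accuracy of a model trained on them (hypothetical hook).
    """
    ranked = sorted(importances, key=importances.get, reverse=True)
    selected = []
    best_features, best_score = [], float("-inf")
    for feature in ranked:
        selected.append(feature)
        score = evaluate(selected)
        if score > best_score:
            best_score, best_features = score, list(selected)
    return best_features, best_score

# Toy importance scores for three sensors (illustrative values only).
importances = {"temperature": 0.5, "vibration": 0.3, "noise": 0.05}

# Toy accuracies per feature-set size (illustrative values only).
toy_accuracy = {1: 0.80, 2: 0.90, 3: 0.85}

def evaluate(features):
    return toy_accuracy[len(features)]

best_features, best_score = iterative_feature_selection(importances, evaluate)
```

In this toy run the third sensor lowers accuracy, so the two-feature model is kept, mirroring the abstract's point that irrelevant data can decrease prediction accuracy.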
