Network Intrusion Detection in Big Dataset Using Spark

Dahiya, Priyanka; Srivastava, Devesh Kumar

doi:10.1016/j.procs.2018.05.169

Cited by 67 publications

(33 citation statements)

References 1 publication

“…The highest accuracy obtained was 85.56% using decision tree that also generated a false alarm rate of 15.78%. As discussed in [22], experimentation was conducted on Apache Spark to improve the accuracy and it can be noted that REP tree model achieved an accuracy of 93.56%. The training time taken was 7.92 seconds to learn 47,342 instances.…”

Section: Introduction (mentioning)

confidence: 99%

Performance analysis of binary and multiclass models using azure machine learning

Rajagopal

Hareesha

Kundapur

2020

IJECE

Network data is expanding at an alarming rate. Besides, the sophisticated attack tools used by hackers lead to a capricious cyber threat landscape. Traditional models proposed for network intrusion detection using machine learning algorithms emphasize improving the attack detection rate and reducing false alarms, but time efficiency is often overlooked. To address this limitation, a modern solution is presented using a Machine Learning-as-a-Service platform. The proposed work analyses the performance of eight two-class and three multiclass algorithms using UNSW NB-15, a modern intrusion detection dataset. 82,332 testing samples were considered to evaluate the performance of the algorithms. The proposed two-class decision forest model exhibited 99.2% accuracy and took 6 seconds to learn 175,341 network instances. A multiclass classification task was also undertaken, wherein attack types such as generic, exploits, shellcode and worms were classified with recall of 99%, 94.49%, 91.79% and 90.9% respectively by the multiclass decision forest model, which also outperformed the others in training and execution time.

“…Various methods have been proposed in the literature for network anomaly detection including standard machine learning classifiers 4–29 and deep learning techniques 30–47 . Muda et al performed clustering before classification and compared the single classifiers with hybrid classifiers.…”

Section: Related Work (mentioning)

confidence: 99%

“…Dhaliwal et al developed several XGBoost models and obtained 98.70% accuracy on NSL‐KDD dataset 26 . Dahiya and Srivastava compared two dimension reduction algorithms such as canonical correlation analysis and linear discriminant analysis using several classification algorithms and obtained at most 95.53% accuracy rate on UNSW‐NB dataset using canonical correlation analysis with bagging 27 . Verma et al compared several boosting algorithms using NSL‐KDD dataset, and they reached 99.86% accuracy rate using XGBoost with K‐means clustering 28 .…”

Section: Related Work (mentioning)

confidence: 99%

A deep learning approach with Bayesian optimization and ensemble classifiers for detecting denial of service attacks

Görmez

Aydın

Karademir

et al. 2020

Int J Communication

Summary: Detecting malicious behavior is important for preventing security threats in a computer network. Denial of Service (DoS) is among the popular cyber attacks targeted at web sites of high‐profile organizations and can potentially have high economic and time costs. In this paper, several machine learning methods including ensemble models and autoencoder‐based deep learning classifiers are compared and tuned using Bayesian optimization. The autoencoder framework makes it possible to extract new features by mapping the original input to a new space. The methods are trained and tested for both binary and multi‐class classification on the Digiturk and Labris datasets, which were introduced recently for detecting various types of DDoS attacks. The best performing methods are found to be ensembles, though deep learning classifiers achieved a comparable level of accuracy.

“…Apache Spark [5] was selected as a framework for processing streaming data. It works much faster than Hadoop, supports cluster mode, and it is compatible with other Apache products.…”

Section: Selection and Review Of Software Tools (mentioning)

confidence: 99%

“…In [5] authors compare several methods for detecting anomalies on UNSW-NB15 dataset. They test correlation analysis, linear discriminant analysis and seven well known classification algorithms within the bigdata tool Apache Spark.…”

Section: Introduction (mentioning)

confidence: 99%

Security event data collection and analysis in large corporate networks

Чернова

Polezhaev

Shukhman

et al. 2019

Proceedings of the V International Conference Information Technology and Nanotechnology 2019

Every year computer networks become more complex, which directly affects the provision of a high level of information security. Different commercial services, critical systems, and information resources prevailing in such networks are profitable targets for terrorists, cyber-spies, and criminals. The consequences range from the theft of strategic, highly valued intellectual property and direct financial losses to significant damages to a brand and customer trust. Attackers have the advantage in complex computer networks – it is easier to hide their tracks. The detection and identification of security incidents are the most important and difficult tasks. It is required to detect security incidents as soon as possible, to analyze and respond to them correctly, so as not to complicate the work of the enterprise computer network. The difficulty is that different event sources offer different data formats or can duplicate events. In addition, some events do not indicate any problems on their own, but their sequence may indicate the presence of a security incident. All collection processes of security events must be performed in real-time, which means streaming data processing.
