Sikha Bagui scite author profile

Sikha Bagui

5 Publications

149 Citation Statements Received

58 Citation Statements Given

How they've been cited

383

149

How they cite others

Affiliations

University of West Florida, Marcus (United States)

Publications

Resampling imbalanced data for network intrusion detection datasets

Bagui

2021

J Big Data

156

Machine learning plays an increasingly significant role in the building of Network Intrusion Detection Systems. However, machine learning models trained with imbalanced cybersecurity data cannot recognize minority data, hence attacks, effectively. One way to address this issue is to use resampling, which adjusts the ratio between the different classes, making the data more balanced. This research looks at resampling’s influence on the performance of Artificial Neural Network multi-class classifiers. The resampling methods, random undersampling, random oversampling, random undersampling and random oversampling, random undersampling with Synthetic Minority Oversampling Technique, and random undersampling with Adaptive Synthetic Sampling Method were used on benchmark Cybersecurity datasets, KDD99, UNSW-NB15, UNSW-NB17 and UNSW-NB18. Macro precision, macro recall, macro F1-score were used to evaluate the results. The patterns found were: First, oversampling increases the training time and undersampling decreases the training time; second, if the data is extremely imbalanced, both oversampling and undersampling increase recall significantly; third, if the data is not extremely imbalanced, resampling will not have much of an impact; fourth, with resampling, mostly oversampling, more of the minority data (attacks) were detected.
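The combined random undersampling and random oversampling described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the per-class `target` size, and the toy data are assumptions made for illustration, and only the Python standard library is used. The macro recall helper shows why an unweighted per-class average exposes a classifier that ignores the minority class.

```python
import random
from collections import Counter, defaultdict


def random_resample(samples, labels, target, seed=0):
    """Resample every class to exactly `target` items (hypothetical helper).

    Classes with at least `target` items are randomly undersampled without
    replacement; smaller classes are randomly oversampled with replacement.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    out_x, out_y = [], []
    for y, items in by_class.items():
        if len(items) >= target:
            chosen = rng.sample(items, target)  # undersample
        else:
            extra = rng.choices(items, k=target - len(items))
            chosen = items + extra  # oversample by duplicating records
        out_x.extend(chosen)
        out_y.extend([y] * target)
    return out_x, out_y


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, so rare classes count equally."""
    recalls = []
    for c in sorted(set(y_true)):
        support = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / support)
    return sum(recalls) / len(recalls)


# Toy, made-up data: 95 "normal" records and 5 "attack" records.
X = list(range(100))
y = ["normal"] * 95 + ["attack"] * 5

X_bal, y_bal = random_resample(X, y, target=20)
print(Counter(y_bal))  # both classes now have 20 records

# A classifier that always predicts "normal" is 95% accurate on the
# original data, but its macro recall is only 0.5.
print(macro_recall(y, ["normal"] * 100))
```

Oversampling enlarges the training set and undersampling shrinks it, which is consistent with the training-time pattern the abstract reports.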

Breast cancer detection using rank nearest neighbor classification rules

Bagui

Pal

Pal

2003

Pattern Recognition

Using machine learning techniques to identify rare cyber‐attacks on the UNSW‐NB15 dataset

Bagui

Kalaimannan

Nandi

et al. 2019

Security and Privacy

This paper uses a hybrid feature selection process and classification techniques to classify cyber‐attacks in the UNSW‐NB15 dataset. A combination of k‐means clustering and correlation‐based feature selection was used to find an optimum subset of features, and then two classification techniques were employed: one probabilistic, Naïve Bayes (NB), and one based on decision trees (J48). Our results show that this hybrid feature selection method in combination with the NB model was able to improve the classification accuracy of most attacks, especially the rare attacks. The false alarm rates were lower for most of the attacks, and particularly the rare attacks, with this combination of feature selection and the NB model. The J48 decision tree model, however, did not perform any better with the feature selection, but its classification rate for all attack families was already very high, with or without feature selection.
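The per-attack classification and false alarm rates mentioned in the abstract can be computed one class at a time. The sketch below assumes a one-vs-rest definition (false alarm rate = false positives divided by all records not in the class); the abstract does not state the exact formula the paper uses, and the function name and toy labels are made up for illustration.

```python
def per_class_rates(y_true, y_pred):
    """Return detection rate and false alarm rate for each class.

    Assumed one-vs-rest definitions:
      detection rate   = TP / (records that truly belong to the class)
      false alarm rate = FP / (records that do not belong to the class)
    """
    n = len(y_true)
    rates = {}
    for c in sorted(set(y_true)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        positives = sum(1 for t in y_true if t == c)
        negatives = n - positives
        rates[c] = {
            "detection_rate": tp / positives,
            "false_alarm_rate": fp / negatives if negatives else 0.0,
        }
    return rates


# Toy, made-up labels: one rare class ("worms") with a single record.
y_true = ["normal"] * 6 + ["dos"] * 3 + ["worms"]
y_pred = ["normal"] * 5 + ["dos"] + ["dos", "dos", "worms"] + ["worms"]

rates = per_class_rates(y_true, y_pred)
for name, r in rates.items():
    print(name, r)
```

Reporting both rates per class keeps a rare attack family from being hidden behind a high overall accuracy.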

Comparison of machine-learning algorithms for classification of VPN network traffic flow using time-related features

Fang

Kalaimannan

Bagui

et al. 2017

Journal of Cyber Security Technology

Network traffic classification and characterisation is playing an increasingly vital role in understanding and solving security-related issues in internet-based applications. Research in this area has primarily focused on characterisation of network traffic based on various layers of communication protocols as outlined in the TCP/IP stack, and has expanded further to concentrate on specific application-layer protocols. Virtual Private Networks (VPNs) have become one of the most popular remote access communication methods among users over the public internet and other Internet Protocol (IP)-based networks. VPNs are governed by IP Security, which is a suite of protocols used for tunnelling the already encrypted IP traffic, to guarantee secure remote access to servers. In this paper, we propose and develop a framework to classify VPN or non-VPN network traffic using time-related features. Our focus is on classification of network traffic which is encrypted and tunnelled through a VPN, and traffic which is normally encrypted (non-VPN transmission), using machine-learning techniques on data sets of time-related features. Six classification models (logistic regression, support vector machine, Naïve Bayes, k-nearest neighbour, and two ensemble methods, the Random Forest (RF) classifier and the Gradient Boosting Tree (GBT) classifier) are compared, and recommendations of optimised RF and GBT models over other models are provided in terms of high accuracy and low overfitting. Features that contributed to achieving 90% accuracy in each category were also identified.
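To make "time-related features" concrete, here is a minimal sketch that derives a few timing statistics from the packet timestamps of a single flow. The abstract does not list the features the paper used, so the feature names (`duration`, `iat_*` for inter-arrival time) and the function itself are assumptions for illustration only.

```python
from statistics import mean, pstdev


def flow_time_features(timestamps):
    """Compute simple timing features for one flow (hypothetical feature set).

    `timestamps` holds packet arrival times in seconds. The inter-arrival
    time (IAT) is the gap between consecutive packets.
    """
    ts = sorted(timestamps)
    if len(ts) < 2:
        raise ValueError("need at least two packets to compute gaps")
    gaps = [b - a for a, b in zip(ts, ts[1:])]
    return {
        "duration": ts[-1] - ts[0],
        "iat_min": min(gaps),
        "iat_max": max(gaps),
        "iat_mean": mean(gaps),
        "iat_std": pstdev(gaps),  # population standard deviation
    }


# Toy, made-up flow with four packets.
features = flow_time_features([0.0, 0.5, 1.0, 3.0])
print(features)
```

A table of such per-flow feature vectors, labelled VPN or non-VPN, is the kind of input the compared classifiers would be trained on.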

A multistage generalization of the rank nearest neighbor classification rule

Bagui

Pal

1995

Pattern Recognition Letters
