Network intrusion detection research that employs the KDDCup 99 dataset often encounters challenges in creating classifiers that can handle unequally distributed attack categories. The accuracy of a classification model can be jeopardized if the distribution of attack categories in the training dataset is heavily imbalanced, with the rare categories making up less than 2% of the total population. In such cases, the model cannot efficiently learn the characteristics of the rare categories, which results in poor detection rates. In this research, we introduce an efficient and effective approach to dealing with the unequal distribution of attack categories. Our approach relies on training cascaded classifiers using a dichotomized training dataset at each cascading stage. The training dataset is dichotomized into rare and non-rare attack categories. The empirical findings support our argument that training cascaded classifiers on the dichotomized dataset provides higher detection rates for the rare categories, as well as comparably higher detection rates for the non-rare attack categories, than the findings reported in other research works. The higher detection rates are attributed to the reduced influence of the dominant categories once the rare attack categories are separated from the rest of the dataset.
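The cascade described in this abstract can be illustrated with a minimal sketch. The abstract does not specify the base learner or the exact number of stages, so everything here is an assumption for illustration: a toy nearest-centroid learner stands in for the real classifiers, the cascade has two stages (a rare versus non-rare gate, then one classifier per half), and the feature vectors and the helper names `train_cascade` and `predict_cascade` are hypothetical. The category names follow the usual KDDCup 99 grouping.

```python
from collections import defaultdict
from math import dist


class NearestCentroid:
    """Toy base learner (an assumption); any classifier could take its place."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for x, label in zip(X, y):
            prev = sums.get(label, [0.0] * len(x))
            sums[label] = [s + v for s, v in zip(prev, x)]
            counts[label] += 1
        # Store one mean vector per label.
        self.centroids = {
            label: tuple(s / counts[label] for s in total)
            for label, total in sums.items()
        }
        return self

    def predict_one(self, x):
        x = tuple(x)
        return min(self.centroids, key=lambda label: dist(x, self.centroids[label]))


def train_cascade(X, y, rare_labels):
    # Stage 1: dichotomize the labels into "rare" vs "non_rare" and train a gate.
    groups = ["rare" if label in rare_labels else "non_rare" for label in y]
    gate = NearestCentroid().fit(X, groups)
    # Stage 2: train one classifier per half, using only that half's records,
    # so the dominant categories cannot swamp the rare ones.
    experts = {}
    for group in ("rare", "non_rare"):
        Xg = [x for x, g in zip(X, groups) if g == group]
        yg = [label for label, g in zip(y, groups) if g == group]
        experts[group] = NearestCentroid().fit(Xg, yg)
    return gate, experts


def predict_cascade(model, x):
    gate, experts = model
    return experts[gate.predict_one(x)].predict_one(x)


# Hypothetical 2-D feature vectors; real KDDCup 99 records have 41 features.
X = [(0, 0), (0.2, 0.1), (0.1, 0.3), (1, 0), (1.2, 0.1), (0.9, 0.2), (10, 10), (10, 12)]
y = ["normal", "normal", "normal", "dos", "dos", "dos", "u2r", "r2l"]
model = train_cascade(X, y, rare_labels={"u2r", "r2l"})
print(predict_cascade(model, (10, 9.5)))
print(predict_cascade(model, (0.1, 0.1)))
```

Because the second-stage classifier for the rare half never sees the dominant categories, its decision boundary is set only by the rare records, which is the effect the abstract attributes the improved detection rates to.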
Fraud and default payments are two major anomalies in credit card transactions. Researchers have been actively seeking solutions to tackle them, and one such solution is to use data mining approaches. However, the collected credit card data can be challenging to work with because of two characteristics: (i) unbalanced class distribution, and (ii) overlapping class samples. Both characteristics generally cause low detection rates for the anomalies, which are minorities in the data. On top of that, general learning algorithms tend to be biased towards the majority class samples, which adds to the difficulty of classifying the anomalies. In this study, we used a Multiple Classifiers System (MCS) on two data sets: (i) credit card frauds (CCF), and (ii) credit card default payments (CCDP). The MCS employs a sequential decision combination strategy to produce accurate anomaly detection. Our empirical studies show that the MCS outperforms existing research, particularly in detecting the anomalies that are minorities in these two credit card data sets. Index Terms: Anomaly detection, credit card, multiple classifiers.
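The abstract does not describe how the sequential decision combination works internally, so the following is only a sketch of one common form of it, under stated assumptions: classifiers are consulted in order, a stage decides when its anomaly score is confidently high or low, and otherwise defers to the next stage. The function `sequential_decision`, the two toy scorers, the record fields, and all thresholds are hypothetical and are not taken from the paper.

```python
from typing import Callable, Sequence, Tuple

# A scorer maps a transaction record to an estimated anomaly probability in [0, 1].
Scorer = Callable[[dict], float]


def sequential_decision(
    record: dict,
    stages: Sequence[Tuple[Scorer, float, float]],
    default: str = "normal",
) -> str:
    """Consult each (scorer, low, high) stage in order.

    A stage decides "anomaly" if its score is >= high, "normal" if it is
    <= low, and otherwise passes the record on to the next stage.
    """
    for scorer, low, high in stages:
        score = scorer(record)
        if score >= high:
            return "anomaly"
        if score <= low:
            return "normal"
    return default  # no stage was confident enough


# Hypothetical stand-ins for trained classifiers.
def amount_score(record: dict) -> float:
    return min(record["amount"] / 1000.0, 1.0)


def velocity_score(record: dict) -> float:
    return min(record["tx_last_hour"] / 10.0, 1.0)


stages = [(amount_score, 0.1, 0.9), (velocity_score, 0.3, 0.6)]

print(sequential_decision({"amount": 950, "tx_last_hour": 1}, stages))
print(sequential_decision({"amount": 500, "tx_last_hour": 8}, stages))
print(sequential_decision({"amount": 50, "tx_last_hour": 9}, stages))
```

In this arrangement, later stages only see the records that earlier stages found ambiguous, which is one way a sequential scheme can concentrate effort on overlapping class samples.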
Problem statement: Implementing single or multiple classifiers that involve a Bayesian Network (BN) is of rising research interest in the network intrusion detection domain. Approach: However, little attention has been given to evaluating the performance of BN classifiers before they are implemented in a real system. In this research, we proposed a novel approach to selecting important features using two filter-based feature selection algorithms. Results: The selected features were further validated by domain experts, and extra features were added to the final proposed feature set. We then constructed three types of BN, namely Naive Bayes Classifiers (NBC), Learned BN and Expert-elicited BN, using a standard network intrusion dataset. The performance of each classifier was recorded. We found that there was no difference in the overall performance of the BNs and therefore concluded that the BNs performed equivalently well in detecting network attacks. Conclusion/Recommendations: The results of the study indicated that the BN built using the proposed feature set has fewer features, but its performance was comparable to that of BNs built using the other feature sets generated by the two algorithms.
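The feature selection step can be sketched as follows. The abstract does not name the two filter algorithms or say how their outputs were combined, so this sketch assumes information gain and gain ratio as the two filters, keeps the features that both rank in their top k, and then adds the expert-supplied features. The function names, the toy records, and the feature names are illustrative only.

```python
from collections import Counter
from math import log2


def entropy(values):
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def info_gain(column, labels):
    # H(labels) minus the entropy of the labels conditioned on the feature value.
    n = len(labels)
    conditional = 0.0
    for value in set(column):
        subset = [label for v, label in zip(column, labels) if v == value]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


def gain_ratio(column, labels):
    # Information gain normalized by the entropy of the feature itself.
    split = entropy(column)
    return info_gain(column, labels) / split if split > 0 else 0.0


def top_k(rows, labels, scorer, k):
    scores = {
        name: scorer([row[name] for row in rows], labels) for name in rows[0]
    }
    return set(sorted(scores, key=scores.get, reverse=True)[:k])


def propose_features(rows, labels, k, expert_additions=()):
    # Assumption: keep features both filters agree on, then add expert picks.
    agreed = top_k(rows, labels, info_gain, k) & top_k(rows, labels, gain_ratio, k)
    return agreed | set(expert_additions)


# Hypothetical discrete records, loosely styled after connection attributes.
rows = [
    {"flag": "S0", "protocol": "tcp", "service": "http"},
    {"flag": "S0", "protocol": "tcp", "service": "ftp"},
    {"flag": "SF", "protocol": "tcp", "service": "http"},
    {"flag": "SF", "protocol": "udp", "service": "ftp"},
]
labels = ["attack", "attack", "normal", "normal"]

print(sorted(propose_features(rows, labels, k=1, expert_additions=["service"])))
```

In this toy data, `flag` separates the two classes perfectly, so both filters rank it first, while `service` carries no information on its own and enters the final set only through the expert additions, mirroring the validation step the abstract describes.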