Due to the lack of adequate public datasets, the proponents of many existing cloud intrusion detection systems (IDS) have relied on the DARPA dataset to design and evaluate their models. In this paper, we show empirically that the DARPA dataset fails to meet important statistical characteristics of real-world cloud data center traffic and is therefore inadequate for evaluating cloud IDS. As an alternative, we present a new public dataset, collected in cooperation between our lab and a non-profit cloud service provider, which contains benign data and a wide variety of attack data. We also present a new hypervisor-based cloud IDS that uses an instance-oriented feature model and supervised machine learning techniques. We investigate three classifiers: Logistic Regression (LR), Random Forest (RF), and Support Vector Machine (SVM). Experimental evaluation on a diversified dataset yields a detection rate of 92.08% and a false positive rate of 1.49% for Random Forest, the best performing of the three classifiers.
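
The abstract reports a detection rate and a false positive rate without defining them. The sketch below assumes the standard definitions, detection rate = TP / (TP + FN) and false positive rate = FP / (FP + TN), with label 1 meaning attack and 0 meaning benign. The function name and the toy labels are hypothetical and are not taken from the paper or its dataset.

```python
from typing import Sequence, Tuple


def detection_and_false_positive_rates(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (detection rate, false positive rate) for binary labels.

    Assumed convention (not stated in the abstract): 1 = attack, 0 = benign.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # attack correctly flagged
        elif truth == 1 and pred == 0:
            fn += 1  # attack missed
        elif truth == 0 and pred == 1:
            fp += 1  # benign traffic wrongly flagged
        else:
            tn += 1  # benign traffic correctly passed

    # Guard against empty classes to avoid division by zero.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


if __name__ == "__main__":
    # Toy labels for illustration only: 4 attacks, 6 benign samples.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
    dr, fpr = detection_and_false_positive_rates(y_true, y_pred)
    print(f"detection rate = {dr:.2%}, false positive rate = {fpr:.2%}")
```

Under these assumed definitions, comparing LR, RF, and SVM would amount to computing this pair of rates from each classifier's predictions on the same held-out data.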