Data mining is an interdisciplinary subfield of computer science involving methods at the intersection of artificial intelligence, machine learning and statistics. One data mining task is anomaly detection, the analysis of large quantities of data to identify items, events or observations that do not conform to an expected pattern. Anomaly detection is applicable in a variety of domains, e.g., fraud detection, fault detection and system health monitoring, but this article focuses on its application to network intrusion detection. The main goal of the article is to prove that an entropy-based approach is suitable for detecting modern botnet-like malware based on anomalous network patterns. This aim is pursued through the following steps: (i) developing the concept of an original entropy-based network anomaly detection method, (ii) implementing the method, (iii) preparing an original dataset, and (iv) evaluating the method.
Abstract. Entropy-based anomaly detection has recently been studied extensively in order to overcome the weaknesses of traditional volume-based and rule-based approaches to network flow analysis. Of the many entropy measures, only Shannon, Titchener, and the parameterized Renyi and Tsallis entropies have been applied to network anomaly detection. In this paper, we present our method based on parameterized entropy and supervised learning. With this method we are able to detect a broad spectrum of anomalies with a low false positive rate. In addition, we provide information revealing the anomaly type. The experimental results suggest that our method performs better than the Shannon-based and volume-based approaches.
Keywords: anomaly detection, entropy, netflow, network traffic measurement
Introduction
The number of anomalies in IP networks caused by worm-like activities is growing [2]. Widely used security solutions based on signatures or rules, such as firewalls, antivirus software and intrusion detection systems, do not provide sufficient protection because they cannot cope with evasion techniques and previously unknown (0-day) attacks [12], [13]. Therefore, network anomaly detection, as one possible solution, is becoming an essential area of research. Anomaly detection is the identification of observations that do not conform to expected behavior. Supervised anomaly detection requires a labeled data set on which a classifier is trained.
There are many problems with anomaly detectors that have to be addressed. The main challenge is setting a precise boundary between normal and anomalous behavior, so as to avoid a high false positive rate or a low detection rate. Other problems are long computation time, extraction of anomaly details, and root-cause identification [7]. In our previous work [4], some generalizations of entropy were described in detail and preliminary results of using parameterized entropies were presented. In this paper, we make two major contributions. Firstly, we present our method and its results in comparison with the Shannon-based and volume-based approaches. Secondly, we describe the data set as well as the method we used to generate anomalies.
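To make the entropy measures named above concrete, the following is a minimal sketch of the standard textbook definitions of Shannon, Renyi and Tsallis entropy, applied to the empirical distribution of a single flow feature. It is not the implementation used in this paper: the function names, the choice of destination port as the feature, the sample windows and the value of the parameter alpha are all assumptions made for illustration.

```python
import math
from collections import Counter


def probabilities(values):
    """Empirical probability of each distinct value in one time window."""
    counts = Counter(values)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def shannon_entropy(probs):
    """Shannon entropy in bits: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def renyi_entropy(probs, alpha):
    """Renyi entropy of order alpha, in bits (alpha >= 0)."""
    if alpha == 1:
        # The limit alpha -> 1 is the Shannon entropy.
        return shannon_entropy(probs)
    return math.log2(sum(p ** alpha for p in probs if p > 0)) / (1 - alpha)


def tsallis_entropy(probs, alpha):
    """Tsallis entropy with parameter alpha."""
    if alpha == 1:
        # The limit alpha -> 1 is the Shannon entropy with natural logarithm.
        return -sum(p * math.log(p) for p in probs if p > 0)
    return (1 - sum(p ** alpha for p in probs if p > 0)) / (alpha - 1)


if __name__ == "__main__":
    # Hypothetical destination ports seen in two windows (illustrative data only).
    usual_window = [80, 443, 443, 80, 53, 443, 80, 443]
    scan_window = [21, 22, 23, 25, 53, 80, 110, 443]  # one flow per port

    for name, window in (("usual", usual_window), ("scan", scan_window)):
        p = probabilities(window)
        print(name,
              shannon_entropy(p),
              renyi_entropy(p, alpha=2.0),
              tsallis_entropy(p, alpha=2.0))
```

In this sketch, a window in which flows are spread evenly over many ports yields higher entropy than one concentrated on a few ports. The parameter alpha changes how strongly frequent versus rare values contribute to the result, which is what distinguishes the parameterized measures from Shannon entropy.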