The classification of IP flows according to the application that generated them is at the basis of any modern network management platform. However, classical techniques, such as those based on the analysis of transport layer or application layer information, are rapidly becoming ineffective. In this paper we present a flow classification mechanism based on three simple properties of the captured IP packets: their size, inter-arrival time and arrival order. Even though these quantities have already been used in the past to define classification techniques, our contribution is based on new structures called protocol fingerprints, which express such quantities in a compact and efficient way, and on a simple classification algorithm based on normalized thresholds. Although at a very early stage of development, the proposed technique is showing promising preliminary results from the classification of a reduced set of protocols.
Abstract: Correct classification of traffic flows according to the application layer protocols that generated them is essential for most network management, resource allocation and intrusion detection systems in TCP/IP networks. With the ever-increasing number of network protocols and services running on nonstandard TCP ports, classification methods based on the analysis of the transport layer header are rapidly becoming ineffective. On the other hand, mechanisms based on full payload analysis are too computationally demanding to be run on most high-bandwidth links. Here we present a novel classification technique based on the statistical analysis of network traffic performed at the IP level. The key idea behind our approach is to build a set of protocol fingerprints that we believe summarize, in a compact and efficient way, the main IP-level statistical properties of application layer protocols. By means of a simple, lightweight algorithm based on the notion of anomaly scores, also presented in this paper, an unknown flow can be compared against known protocol fingerprints to detect the application that generated the flow. Our methodology is based entirely on IP-level analysis: no payload analysis or port analysis is required for the classification of an unknown flow. Besides introducing our approach, we describe preliminary experimental results that show how this technique is effective in correctly classifying network traffic in a real network environment.
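The abstracts above name the ingredients (packet size, inter-arrival time, arrival order, protocol fingerprints, anomaly scores, a normalized threshold) but do not give the data structures or formulas. The following is therefore only a minimal sketch of how such a scheme could look, not the authors' algorithm. Everything in it is an assumption made for illustration: the per-position histogram used as a fingerprint, the bin widths, the probability floor, the log-based score, the threshold value, and the toy training flows.

```python
import math
from collections import defaultdict

# All constants below are illustrative assumptions, not values from the papers.
SIZE_BIN = 100   # width of a packet-size bin, in bytes
MAX_POS = 4      # only the first MAX_POS packets of a flow are considered
FLOOR = 1e-3     # probability assigned to (size, gap) bins never seen in training


def bin_packet(size, gap):
    """Map a packet to a coarse (size bin, log10 inter-arrival bin) pair."""
    return (size // SIZE_BIN, math.floor(math.log10(max(gap, 1e-6))))


def build_fingerprint(flows):
    """Hypothetical fingerprint: one empirical histogram per packet position.

    Each flow is a list of (size_bytes, inter_arrival_seconds) pairs in
    arrival order, so the list index encodes the arrival order.
    """
    counts = [defaultdict(int) for _ in range(MAX_POS)]
    totals = [0] * MAX_POS
    for flow in flows:
        for pos, (size, gap) in enumerate(flow[:MAX_POS]):
            counts[pos][bin_packet(size, gap)] += 1
            totals[pos] += 1
    return [
        {b: c / totals[pos] for b, c in counts[pos].items()}
        for pos in range(MAX_POS)
    ]


def anomaly_score(flow, fingerprint):
    """Score in [0, 1]: 0 means a perfect match, 1 means nothing matched."""
    packets = flow[:MAX_POS]
    if not packets:
        raise ValueError("flow has no packets")
    total = 0.0
    for pos, (size, gap) in enumerate(packets):
        p = fingerprint[pos].get(bin_packet(size, gap), FLOOR)
        total += -math.log(max(p, FLOOR))
    # Normalize by the worst possible total so scores are comparable
    # across flows with different numbers of observed packets.
    return total / (len(packets) * -math.log(FLOOR))


def classify(flow, fingerprints, threshold=0.5):
    """Return the best-matching protocol name, or None if no score is low enough."""
    best = min(fingerprints, key=lambda name: anomaly_score(flow, fingerprints[name]))
    return best if anomaly_score(flow, fingerprints[best]) <= threshold else None


# Toy training data, invented for this example only.
training = {
    "bulk": [
        [(1400, 0.002), (1400, 0.002), (1400, 0.003), (1400, 0.002)],
        [(1450, 0.002), (1420, 0.004), (1400, 0.002), (1410, 0.003)],
    ],
    "interactive": [
        [(80, 0.2), (90, 0.3), (70, 0.25), (85, 0.4)],
        [(60, 0.5), (95, 0.2), (75, 0.3), (80, 0.2)],
    ],
}
fingerprints = {name: build_fingerprint(flows) for name, flows in training.items()}
```

With these toy fingerprints, `classify([(1430, 0.003)] * 4, fingerprints)` selects `"bulk"`, while a flow whose packets fall in bins absent from every fingerprint is left unclassified (`None`). Note that the sketch reads only packet sizes and timing, in line with the claim that no port or payload inspection is needed.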