Abstract. The identification of application flows is a critical task in order to manage bandwidth requirements of different kind of services (i.e. VOIP, Video, ERP). As network security functions spread, an increasing amount of traffic is natively encrypted due to privacy issues (e.g. VPN). This makes ineffective current traffic classification systems based on ports and payload inspection, e.g. even powerful Deep Packet Inspection is useless to classify application flow carried inside SSH sessions. We have developed a real time traffic classification method based on cluster analysis to identify SSH flows from statistical behavior of IP traffic parameters, such as length, arrival times and direction of packets. In this paper we describe our approach and relevant obtained results. We achieve detection rate up to 99.5 % in classifying SSH flows and accuracy up to 99.88 % for application flows carried within those flows, such as SCP, SFTP and HTTP over SSH.
Application level traffic classification has been addressed in demonstrated recently based on statistical features of packet flows. Among the most significant characteristics is packet length. Even ciphered flows leak information about their content through the sequence of packet length values. There are obvious ways to destroy such side information, e.g. by setting all packet at maximum allowed length. This approach could ential an extremely large overhead, which makes it impractical. There is room to investigate the optimal trade-off between overhead/complexity of packet length masking and suppression of information leakage about flow content through packet length values. In this work we characterize the optimum first order statistical padding technique which guarantees indistinguishability of different application flows. We also discuss how to account for subsequent packet length correlation. Numerical results are shown with reference to real network traffic traces, specifically flows of HTTP, POP3, SSH, and FTP (control session) traffic
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.