Abstract-In this paper we focus on passive measurements of TCP traffic, the main component of today's traffic. We propose a heuristic technique for classifying the anomalies that may occur during the lifetime of a TCP flow, such as out-of-sequence and duplicate segments. Since TCP is a closed-loop protocol that infers network conditions from losses and reacts accordingly, the ability to carefully distinguish the causes of anomalies in TCP traffic is very appealing: it may be instrumental both to a deep understanding of TCP behavior in real environments and to protocol engineering. We apply the proposed heuristic to traffic traces collected at both network edges and backbone links. By studying the statistical properties of TCP anomalies, we find that their aggregate exhibits Long Range Dependence, whereas the anomalies suffered by individual long-lived flows are uncorrelated. Interestingly, no dependence on the actual link load is observed.
I. INTRODUCTION
In the last ten years, interest in data collection, measurement and analysis to characterize Internet traffic behavior has increased steadily. Indeed, acknowledging the failure of traditional modeling paradigms, the research community has focused on the analysis of traffic characteristics with the twofold objective of i) understanding the dynamics of traffic and its impact on network elements and ii) finding simple, yet satisfactory, models for the design and planning of packet-switched data networks, much as Erlang teletraffic theory does for telephone networks.
By focusing on passive traffic characterization, we face the task of measuring Internet traffic, which is particularly daunting for a number of reasons. First, traffic analysis is made very difficult by correlations in both space and time, which are due to the closed-loop behavior of TCP, the TCP/IP client-server communication paradigm, and the fact that the highly variable quality provided to end users influences their behavior. Second, the complexity of the protocols involved, e.g., TCP itself, is such that a number of phenomena can be studied only by exploiting a deep knowledge of the protocol details. Finally, some of the traffic dynamics can be understood only if the forward and backward directions of flows are jointly analyzed, which is especially true for the detection of erratic flow behavior. Starting from [1], where a simple but efficient classification algorithm for out-of-sequence TCP segments is presented, this work aims at identifying and analyzing a larger subset of phenomena, including, e.g., network duplicates, unneeded retransmissions and flow control mechanisms triggered or suffered by TCP flows.
The proposed classification technique has been applied to a set of real traces collected at different measurement points.