A matrix giving the traffic volumes between origin and destination in a network has tremendous potential utility for network capacity planning and management. Unfortunately, traffic matrices are generally unavailable in large operational IP networks. On the other hand, link load measurements are readily available in IP networks. In this paper, we propose a new method for practical and rapid inference of traffic matrices in IP networks from link load measurements, augmented by readily available network and routing configuration information. We apply and validate the method by computing backbone-router to backbone-router traffic matrices on a large operational tier-1 IP network, a problem an order of magnitude larger than any other comparable method has tackled. The results show that the method is remarkably fast and accurate, delivering the traffic matrix in under five seconds.
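To make the setup concrete, here is a minimal sketch (not the paper's algorithm) of the generic inference problem the abstract describes: measured link loads y relate to the unknown origin-destination vector x through a routing matrix A derived from configuration data, via y = A x. The gravity-model prior x_prior used as a starting point is an illustrative assumption.

```python
import numpy as np

def infer_traffic_matrix(A, y, x_prior):
    """Estimate origin-destination (OD) traffic volumes from link loads.

    A : (links x OD-pairs) routing matrix from routing configuration
    y : measured link load vector
    x_prior : prior OD estimate (e.g., a simple gravity model); an
              illustrative assumption, not the paper's exact prior.

    The system y = A @ x is highly underdetermined, so we apply the
    least-squares correction that reconciles the prior with the loads.
    """
    dx, *_ = np.linalg.lstsq(A, y - A @ x_prior, rcond=None)
    return x_prior + dx
```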
Measurement and estimation of packet loss characteristics are challenging due to the relatively rare occurrence and typically short duration of packet loss episodes. While active probe tools are commonly used to measure packet loss on end-to-end paths, there has been little analysis of the accuracy of these tools or their impact on the network. The objective of our study is to understand how to measure packet loss episodes accurately with end-to-end probes. We begin by testing the capability of standard Poisson-modulated end-to-end measurements of loss in a controlled laboratory environment using IP routers and commodity end hosts. Our tests show that loss characteristics reported from such Poisson-modulated probe tools can be quite inaccurate over a range of traffic conditions. Motivated by these observations, we introduce a new algorithm for packet loss measurement that is designed to overcome the deficiencies in standard Poisson-based tools. Specifically, our method creates a probe process that (1) enables an explicit trade-off between accuracy and impact on the network, and (2) enables more accurate measurements than standard Poisson probing at the same rate. We evaluate the capabilities of our methodology experimentally by developing and implementing a prototype tool, called BADABING. The experiments demonstrate the trade-offs between impact on the network and measurement accuracy. We show that BADABING reports loss characteristics far more accurately than traditional loss measurement tools.
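As a rough illustration of the kind of probe process described, the sketch below schedules probe events over discrete time slots, each slot independently starting an event with probability p, so gaps between events are geometrically distributed and p directly trades measurement accuracy against network impact. The slot model and the probes_per_event parameter are assumptions for illustration, not BADABING's exact design.

```python
import random

def probe_schedule(num_slots, p, probes_per_event=3):
    """Discrete-time probe schedule in the spirit of the abstract.

    Each slot starts a probe event with probability p (geometric gaps),
    and each event sends a small burst of back-to-back probes so that
    loss-episode boundaries can be observed, not just isolated losses.
    """
    schedule = []
    for slot in range(num_slots):
        if random.random() < p:
            schedule.append((slot, probes_per_event))
    return schedule
```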
Many network management applications use, as their data, traffic volumes differentiated by attributes such as IP address or port number. IP flow records are commonly collected for this purpose: they enable determination of fine-grained usage of network resources. However, the increasingly large volumes of flow statistics incur concomitant costs in the resources of the measurement infrastructure. This motivates sampling of flow records. This paper addresses sampling strategy for flow records. Recent work has shown that non-uniform sampling is necessary in order to control estimation variance arising from the observed heavy-tailed distribution of flow lengths. However, while this approach controls estimator variance, it does not place hard limits on the number of flows sampled. Such limits are often required during arbitrary downstream sampling, resampling, and aggregation operations employed in analysis of the data. This paper proposes a correlated sampling strategy that is able to select an arbitrarily small number of the "best" representatives of a set of flows. We show that usage estimates arising from such selection are unbiased, and show how to estimate their variance, both offline for modeling purposes and online during the sampling itself. The selection algorithm can be implemented in a queue-like data structure in which memory usage is uniformly bounded during measurement. Finally, we compare the complexity and performance of our scheme with other potential approaches.
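A hedged sketch of one bounded-memory, weight-sensitive selection scheme consistent with the abstract's queue-like description (the paper's exact algorithm may differ in details): each flow record gets a random priority proportional to its byte count, a min-heap retains only the top priorities, and retained flows are re-weighted so that usage estimates remain unbiased.

```python
import heapq
import random

def priority_sample(flow_records, k):
    """Select k representative flows from a stream of (key, bytes) records.

    Each record receives priority w / u for an independent uniform
    u in (0, 1]; a min-heap keeps the k+1 largest priorities, so memory
    stays uniformly bounded during measurement. Each retained flow is
    weighted by max(w, threshold), where the threshold is the (k+1)-th
    largest priority; this weighting yields unbiased usage estimates.
    """
    heap = []  # min-heap of (priority, key, weight), size <= k + 1
    for key, w in flow_records:
        u = 1.0 - random.random()  # uniform in (0, 1]
        heapq.heappush(heap, (w / u, key, w))
        if len(heap) > k + 1:
            heapq.heappop(heap)  # evict the lowest priority
    threshold = 0.0
    if len(heap) == k + 1:
        threshold = heapq.heappop(heap)[0]  # (k+1)-th largest priority
    return [(key, max(w, threshold)) for _, key, w in heap]
```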
Service level agreements (SLAs) define performance guarantees made by service providers, e.g., in terms of packet loss, delay, delay variation, and network availability. In this paper, we describe a new active measurement methodology to accurately monitor whether measured network path characteristics are in compliance with performance targets specified in SLAs. Specifically, (1) we describe a new methodology for estimating packet loss rate that significantly improves accuracy over existing approaches; (2) we introduce a new methodology for measuring mean delay along a path that improves accuracy over existing methodologies, and propose a method for obtaining confidence intervals on quantiles of the empirical delay distribution without making any assumption about the true distribution of delay; (3) we introduce a new methodology for measuring delay variation that is more robust than prior techniques; and (4) we extend existing work in network performance tomography to infer lower bounds on the quantiles of a distribution of performance measures along an unmeasured path, given measurements from a subset of paths. We unify active measurements for these metrics in a discrete-time-based tool called SLAM. The unified probe stream from SLAM consumes lower overall bandwidth than if individual streams are used to measure path properties. We demonstrate the accuracy and convergence properties of SLAM in a controlled laboratory environment, using a range of background traffic scenarios and one- and two-hop settings, and examine its accuracy improvements over existing standard techniques.
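For item (2), a standard distribution-free construction of a quantile confidence interval goes through order statistics and the binomial distribution; the sketch below illustrates that idea (SLAM's exact procedure may differ, and the index arithmetic here is a simplified approximation).

```python
from scipy.stats import binom

def delay_quantile_ci(delays, q, confidence=0.95):
    """Distribution-free confidence interval for the q-quantile of delay.

    Among n samples, the number falling below the true q-quantile is
    Binomial(n, q), so binomial tail quantiles pick out a pair of order
    statistics that bracket it, with no assumption on the delay law.
    """
    xs = sorted(delays)
    n = len(xs)
    alpha = 1.0 - confidence
    lo = int(binom.ppf(alpha / 2, n, q))          # lower order-statistic index
    hi = int(binom.ppf(1 - alpha / 2, n, q)) + 1  # upper order-statistic index
    lo = max(lo, 0)
    hi = min(hi, n - 1)
    return xs[lo], xs[hi]
```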
Sampling is crucial for controlling resource consumption by Internet traffic flow measurements. Routers use Packet Sampled NetFlow, and completed flow records are sampled in the measurement infrastructure. Recent research, motivated by the need of service providers to accurately measure both small and large traffic subpopulations, has focused on distributing a packet sampling budget amongst subpopulations. But the long timescales of hardware development and falling bandwidth costs instead motivate post-measurement analysis of complete flow records at collectors. Sampling in collector databases then manages data volumes, yielding general-purpose summaries that are rapidly queried to trigger drill-down analysis on a time-limited window of full data, and that are sufficiently small to be archived. This paper addresses the problem of distributing a sampling budget over subpopulations of flow records. Estimation accuracy goals are met by fairly sharing the budget, and we establish a correspondence between the type of accuracy goal and the flavor of fair sharing used. A streaming Max-Min Fair Sampling algorithm fairly shares the sampling budget across subpopulations, with sampling as a mechanism to deallocate budget. This provides timely samples and is robust against uncertainties in configuration and demand. We illustrate using flow records from an access router of a large ISP, where rates over interface traffic subpopulations vary over several orders of magnitude. We detail an implementation whose computational cost is no worse than that of subpopulation-oblivious sampling.
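The allocation that max-min fair sharing targets can be illustrated with a simple offline water-filling computation (the paper's contribution is a streaming algorithm; this sketch shows only the target allocation it aims for):

```python
def max_min_fair_budget(demands, budget):
    """Water-filling illustration of max-min fair budget sharing.

    Subpopulations demanding less than an equal share are satisfied in
    full, and the freed budget is repeatedly re-shared among the rest.
    """
    allocation = {k: 0.0 for k in demands}
    remaining = dict(demands)  # unmet demands
    while remaining:
        share = budget / len(remaining)
        satisfied = [k for k, d in remaining.items() if d <= share]
        if not satisfied:
            # No demand fits within an equal share: split evenly and stop.
            for k in remaining:
                allocation[k] = share
            break
        for k in satisfied:
            allocation[k] = remaining.pop(k)
            budget -= allocation[k]
    return allocation
```

For example, with demands {"a": 5, "b": 50, "c": 50} and a budget of 60, "a" receives its full demand of 5 while "b" and "c" split the remainder at 27.5 each.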
Measurement, collection, and interpretation of network usage data commonly involve multiple stages of sampling and aggregation. Examples include sampling packets and aggregating them into flow statistics at a router, and sampling and aggregating usage records in a network data repository for reporting, querying, and archiving. Although unbiased estimates of packet, byte, and flow usage can be formed for each sampling operation, for many applications it is crucial to know the inherent estimation error. Previous work in this area has been limited mainly to analyzing the estimator variance for particular methods, e.g., independent packet sampling. However, the variance is of limited use for more general sampling methods, where the estimate may not be well approximated by a Gaussian distribution. This motivates our paper, in which we establish Chernoff bounds on the likelihood of estimation error in a general multistage combination of measurement sampling and aggregation. We derive the scale against which errors are measured, in terms of the constituent sampling and aggregation operations. In particular, this enables us to obtain rigorous confidence intervals around any given estimate. We apply our method to a number of sampling schemes both from the literature and currently deployed, including sampling of packet-sampled NetFlow records, Sample and Hold, and Flow Slicing. We obtain one particularly striking result in the first case: for a range of parameterizations, packet sampling has no additional impact on the estimator confidence derived from our bound, beyond that already imposed by flow sampling.
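As a generic illustration of turning such a bound into a confidence interval (the paper derives the appropriate scale from the constituent sampling and aggregation stages; here it is taken as an input), one can invert a multiplicative Chernoff-style tail bound on the relative error:

```python
import math

def chernoff_interval(estimate, scale, confidence=0.95):
    """Confidence interval from a generic multiplicative Chernoff bound.

    Illustrative only: assumes a tail bound of the form
        P(|X_hat - X| > eps * X) <= 2 * exp(-scale * eps**2 / 3),
    which is solved for eps at the desired confidence level.
    """
    eps = math.sqrt(3.0 * math.log(2.0 / (1.0 - confidence)) / scale)
    return estimate * (1.0 - eps), estimate * (1.0 + eps)
```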
Random sampling has proven time and again to be a powerful tool for working with large data. Queries over the full dataset are replaced by approximate queries over the smaller (and hence easier to store and manipulate) sample. The sample constitutes a flexible summary that supports a wide class of queries. But in many applications, datasets are modified over time, and it is desirable to update samples without requiring access to the full underlying dataset. In this paper, we introduce and analyze novel techniques for sampling over dynamic data, modeled as a stream of modifications to weights associated with each key. While sampling schemes designed for stream applications can often readily accommodate positive updates to the dataset, much less is known for the case of negative updates, where weights are reduced or items deleted altogether. We primarily consider the turnstile model of streams, and extend classic schemes to incorporate negative updates. Perhaps surprisingly, the modifications needed to handle negative updates turn out to be natural and seamless extensions of the well-known positive-update-only algorithms. We show that they produce unbiased estimators, and we relate their performance to the behavior of corresponding algorithms on insert-only streams with different parameters. A careful analysis is needed to account for the fact that sampling choices for one key now depend on the choices made for other keys. In practice, our solutions turn out to be efficient and accurate. Compared to recent algorithms for L_p sampling, which can be applied to this problem, they are significantly more reliable and dramatically faster.
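The flavor of "natural and seamless" extension the abstract mentions can be illustrated with a toy bottom-k-style sampler in which each key's priority is a deterministic function of its current weight, so negative updates simply re-rank keys. This sketch assumes the full weight vector fits in memory, sidestepping the space constraints the paper actually addresses.

```python
import heapq
import random

class TurnstileSampler:
    """Toy priority sampler tolerating negative updates (illustrative only)."""

    def __init__(self, k):
        self.k = k
        self.weights = {}
        self.u = {}  # fixed uniform draw per key

    def update(self, key, delta):
        # A fixed per-key u makes priority w/u a deterministic function
        # of the current weight, so decreases and deletions re-rank keys.
        self.u.setdefault(key, 1.0 - random.random())  # u in (0, 1]
        w = self.weights.get(key, 0.0) + delta
        if w <= 0:
            self.weights.pop(key, None)  # item deleted altogether
        else:
            self.weights[key] = w

    def sample(self):
        # The k keys with the largest priorities w/u form the sample.
        return heapq.nlargest(
            self.k, self.weights,
            key=lambda key: self.weights[key] / self.u[key])
```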