Symposium on Algorithmic Principles of Computer Systems 2020
DOI: 10.1137/1.9781611976021.3

Memory-Efficient Performance Monitoring on Programmable Switches with Lean Algorithms

Abstract: Network performance problems are notoriously difficult to diagnose. Prior profiling systems collect performance statistics by keeping information about each network flow, but maintaining per-flow state is not scalable on resource-constrained NIC and switch hardware. Instead, we propose sketch-based performance monitoring using memory that is sublinear in the number of flows. Existing sketches estimate flow monitoring metrics based on flow sizes. In contrast, performance monitoring typically requires combining …

Cited by 14 publications (10 citation statements)
References 30 publications (47 reference statements)
“…Based on the inferred variables, Dapper can identify the root cause of the bottleneck. Similarly, the authors in [89] monitored conditions such as retransmissions, packet loss, round-trip-time, out-of-order packets to identify the top-k problematic flows. Furthermore, Blink detects failures based on the predictable behavior of TCP, which retransmits packets at epochs exponentially spaced in time, in the presence of failure.…”
Section: Measurements Schemes Comparison Discussion and Limitations (mentioning)
confidence: 99%
“…The key idea is to have every switch maintain fine-grained telemetry data for a short period of time, and upon detecting performance degradation (e.g., increased delay), the information is offloaded to a collector. Liu et al. [81] proposed a memory-efficient approach for network performance monitoring. This solution only monitors the top-k problematic flows.…”
Section: B2 Literature Review (mentioning)
confidence: 99%
“…Applications that are sensitive to memory size would be affected or even infeasible. For example, load balancing [28] lapses into slower, accuracy of sketching or monitoring applications declines, and even network diagnosis [8] that relies on per-flow or per-packet monitoring would be infeasible if the number of connections is large [117]. Hence, most of the applications need to make tradeoffs between performance and memory usage.…”
Section: Memory Capacity (mentioning)
confidence: 99%
“…Instead, as shown in Table 2, Joltik complements these solutions by providing a generalized analytics framework that does not rely on assumptions about the sensed data and can operate in a centralized manner. Sketching for Data Analytics: Sketching algorithms for aggregate statistics have been explored in various contexts, including stream data processing [4,10,18,44,51], database [16,19,25] and network telemetry [36,43,45,46,72,75]. Perhaps the closest related work is UnivMon [45], which enables real-time general network telemetry.…”
Section: Related Work (mentioning)
confidence: 99%