Proceedings of the 15th ACM Workshop on Hot Topics in Networks 2016
DOI: 10.1145/3005745.3005768

Unlocking Credit Loop Deadlocks

Cited by 21 publications (9 citation statements)
References 10 publications
“…This is primarily because HPC clusters are smaller with more controlled traffic patterns, and hence the negative effects of providing losslessness (such as congestion spreading and deadlocks) are rarer. PFC's issues are exacerbated on larger scale clusters [23,24,29,34,37]. Credit-based Flow Control: Since the focus of our work was RDMA deployment over Ethernet, our experiments used PFC.…”
Section: Discussion and Related Work (mentioning)
confidence: 99%
“…However, the current solution is not without problems. In particular, PFC adds management complexity and can lead to significant performance problems such as head-of-the-line blocking, congestion spreading, and occasional deadlocks [23,24,34,36,37]. Rather than continue down the current path and address the various problems with PFC, in this paper we take a step back and ask whether it was needed in the first place.…”
Section: Introduction (mentioning)
confidence: 99%
“…As expected, Figure 6 shows only a constant amount of data traverses the network when using RMCs to compute hashes at the server. Prior work [27,35,48] has discussed many of the challenges of using RDMA in congested networks. In this case, the use of RMCs reduces the amount of bulk data transferred over the network fabric, thereby reducing congestion.…”
Section: Hashing a Remote Buffer (mentioning)
confidence: 99%
“…PFC pauses all upstream interfaces once it detects a risk of packet loss, and the pauses can propagate via a tree-like graph to multiple hops away. Such spreading of congestion can possibly trigger PFC deadlocks [21,23,38] and PFC storms (Case-1 in §1) that can silence a lot of senders even if the network has free capacity. Despite the probability of PFC deadlocks and storms being fairly small, they are still big threats to operators and applications, since currently we have no methods to guarantee they won't occur [23].…”
Section: Our Goals for RDMA (mentioning)
confidence: 99%
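The pause propagation described in the last statement can be illustrated with a small sketch. It is not taken from any of the cited papers. It assumes a hypothetical topology format (a dict mapping each switch to the neighbours that send traffic into it) and a worst-case simplification in which every upstream sender is paused as soon as its downstream neighbour is. Real PFC pauses per priority and only when buffer thresholds are crossed. The second function checks whether a pause could travel back to the switch that started it; such a cyclic dependency is a necessary condition for deadlock, not a sufficient one.

```python
from collections import deque


def paused_switches(upstream, congested):
    """Return every switch a pause could reach, mapped to its hop distance.

    `upstream` is a hypothetical topology format: it maps each switch to the
    neighbours that send traffic into it. Worst case is assumed: a paused
    switch always pauses all of its own upstream senders.
    """
    hops = {congested: 0}
    queue = deque([congested])
    while queue:
        node = queue.popleft()
        for sender in upstream.get(node, ()):
            if sender not in hops:  # each switch is recorded once
                hops[sender] = hops[node] + 1
                queue.append(sender)
    return hops


def pause_can_loop(upstream, congested):
    """Return True if a pause can travel back to the switch that started it."""
    seen = set()
    queue = deque(upstream.get(congested, ()))
    while queue:
        node = queue.popleft()
        if node == congested:
            return True  # cyclic dependency: deadlock becomes possible
        if node in seen:
            continue
        seen.add(node)
        queue.extend(upstream.get(node, ()))
    return False


# Tree-like spread: S1 is congested, S2 and S3 feed it, S4 feeds S2.
tree = {"S1": ["S2", "S3"], "S2": ["S4"]}
# Ring: B feeds A, C feeds B, A feeds C.
ring = {"A": ["B"], "B": ["C"], "C": ["A"]}
```

On the tree, the pause reaches S4 two hops away but never returns to S1. On the ring, it returns to A, which is the kind of loop the statements above associate with PFC deadlocks.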