Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication 2015
DOI: 10.1145/2785956.2787484

Congestion Control for Large-Scale RDMA Deployments

Cited by 258 publications (72 citation statements)
References 26 publications

“…Existing window-based congestion control schemes are not effective for this particular type of workload consisting of very small messages [6] that fit in a single packet, across a large number of connections, in an environment with µs RTTs. Congestion control for such workloads is an open research problem [11,12,31,47] with recent research proposals [33] suggesting a generic congestion control agent that can be used in kernel-bypass networking stacks. Such an approach is a good fit for our networking stack, but we leave this for future work.…”
Section: Methods (mentioning)
confidence: 99%
“…InfiniBand provides rate-based end-to-end congestion control using ECN marks [10,22]. DCQCN [55] has shown that RoCE without end-to-end congestion control degrades in both latency and throughput at high loads.…”
Section: Challenges (mentioning)
confidence: 99%
“…Unlike TCP's window-based rate control, RCP's [16] routers iteratively calculate and directly convey the fair-share bandwidth to the senders sharing a link. DCQCN [55] and TIMELY [38] improve end-to-end congestion control at datacenter scales for RDMA (RoCE) and user-level TCP respectively. Both DCQCN and TIMELY directly control the sending rate by pacing the packets sent out of the NIC.…”
Section: Challenges (mentioning)
confidence: 99%
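
The statement above says that both schemes control the sending rate by pacing packets out of the NIC. As a rough illustration of what rate-based pacing means (as opposed to window-based control), the sketch below spaces packet departures so that the average rate matches a target. It is not the DCQCN or TIMELY algorithm; the function name `pacing_schedule` and its parameters are hypothetical and chosen only for this example.

```python
def pacing_schedule(num_packets, packet_bytes, rate_bps, start_s=0.0):
    """Return departure times (in seconds) for packets paced at rate_bps.

    Illustrative only: a real NIC pacer works in hardware and adjusts
    rate_bps continuously from congestion feedback.
    """
    if rate_bps <= 0:
        raise ValueError("rate_bps must be positive")
    # Time needed to send one packet at the target rate; this is the
    # inter-packet gap the pacer enforces.
    gap_s = packet_bytes * 8 / rate_bps
    return [start_s + i * gap_s for i in range(num_packets)]


if __name__ == "__main__":
    # Three 1250-byte packets at 10 Mbit/s are spaced 1 ms apart.
    for t in pacing_schedule(3, 1250, 10_000_000):
        print(f"{t * 1e3:.3f} ms")
```

Halving `rate_bps` doubles the gap between departures, which is how a rate-based sender slows down without tracking a window of outstanding bytes.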
“…Confining the fabric to a bounded-size unit avoids the emergent safety, performance and monitoring challenges of large-scale fabrics. For example, recent work has shown that scaling RDMA over commodity Ethernet introduces issues of congestion control, dealing with deadlocks and livelocks, and other subtleties of priority-based flow control [29,61].…”
Section: Rackout Data Serving (mentioning)
confidence: 99%