Efficient Redundancy Techniques for Latency Reduction in Cloud Systems

Joshi, Gauri; Soljanin, Emina; Wornell, Gregory W.

doi:10.1145/3055281

Cited by 122 publications

(110 citation statements)

References 36 publications

Supporting

Mentioning

107

Contrasting


“…For the MDS and replication strategies, we reduce the queueing system to a fork-join queueing system with redundancy, and then use previous results [32,35] to obtain bounds on the mean response time. The results are presented in Appendix D.…”

Section: Queueing Analysis (mentioning)

confidence: 99%

Rateless Codes for Near-Perfect Load Balancing in Distributed Matrix-Vector Multiplication

Mallick, Chaudhari, Sheth et al. 2019

Proc. ACM Meas. Anal. Comput. Syst.

Self Cite


Large-scale machine learning and data mining applications require computer systems to perform massive matrix-vector and matrix-matrix multiplication operations that need to be parallelized across multiple nodes. The presence of straggling nodes (computing nodes that unpredictably slow down or fail) is a major bottleneck in such distributed computations. Ideal load balancing strategies that dynamically allocate more tasks to faster nodes require knowledge or monitoring of node speeds as well as the ability to quickly move data. Recently proposed fixed-rate erasure coding strategies can handle unpredictable node slowdown, but they ignore partial work done by straggling nodes, thus resulting in a lot of redundant computation. We propose a rateless fountain coding strategy that achieves the best of both worlds: we prove that its latency is asymptotically equal to that of ideal load balancing, and that it performs asymptotically zero redundant computations. Our idea is to create linear combinations of the m rows of the matrix and assign these encoded rows to different worker nodes. The original matrix-vector product can be decoded as soon as slightly more than m row-vector products are collectively finished by the nodes. We conduct experiments in three computing environments: local parallel computing, Amazon EC2, and Amazon Lambda, which show that rateless coding gives as much as 3× speed-up over uncoded schemes.
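The encode-then-decode idea in this abstract can be illustrated with a minimal sketch. It is not the authors' scheme: as a simplifying assumption it uses sums of random subsets of rows with exact Gaussian elimination, rather than a fountain code with its own degree distribution and decoder. The names `encoded_row`, `try_decode`, and `rateless_matvec` are hypothetical, and the workers are collapsed into a single loop that yields one finished row-vector product at a time.

```python
import random
from fractions import Fraction

def encoded_row(A, rng):
    """Return (coefficients, row): the sum of a random nonempty subset of the rows of A."""
    m = len(A)
    subset = [rng.randint(0, 1) for _ in range(m)]
    if not any(subset):
        subset[rng.randrange(m)] = 1
    row = [sum(c * A[i][j] for i, c in enumerate(subset)) for j in range(len(A[0]))]
    return subset, row

def try_decode(coeffs, values, m):
    """Solve coeffs @ b = values exactly by Gauss-Jordan elimination; None if rank < m."""
    M = [[Fraction(c) for c in row] + [Fraction(v)] for row, v in zip(coeffs, values)]
    for col in range(m):
        pivot = next((r for r in range(col, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            return None  # not enough independent products yet
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(len(M)):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][m] for i in range(m)]

def rateless_matvec(A, x, rng):
    """Keep producing encoded row-vector products until A @ x can be decoded."""
    m = len(A)
    coeffs, values = [], []
    while True:
        c, row = encoded_row(A, rng)  # rateless: new encoded rows on demand
        coeffs.append(c)
        values.append(sum(r * xi for r, xi in zip(row, x)))  # one finished product
        if len(values) >= m:
            b = try_decode(coeffs, values, m)
            if b is not None:
                return b, len(values)

rng = random.Random(0)
m, n = 5, 3
A = [[rng.randint(-5, 5) for _ in range(n)] for _ in range(m)]
x = [rng.randint(-5, 5) for _ in range(n)]
expected = [sum(a * xi for a, xi in zip(row, x)) for row in A]

decoded, used = rateless_matvec(A, x, rng)
assert decoded == expected
print(f"decoded A @ x from {used} encoded products (m = {m})")
```

Because any set of finished products with rank m is enough, it does not matter which worker produced them, which is why partial work from slow nodes is not wasted in this kind of scheme.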


“…Queuing systems with redundancy, on the other hand, are studied in the literature, e.g. [3], [11], [21], [25]. With redundancy, two scenarios for the cancellation of redundant copies of a job have been studied: cancellation after the first copy starts service [3], [11], and cancellation after the first copy finishes service [21], [25].…”

Section: Problem Statement (mentioning)

confidence: 99%

“…Two scenarios have been proposed for treating redundancy in distributed systems. In the first scenario, [3] and [11], copies of an arriving job are submitted to multiple servers and the redundant copies get cancelled once the first copy starts service. The first copy of a job starting the service is the one which faces the least-work-left queue among all the copies.…”

Section: Introduction (mentioning)

confidence: 99%
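The first scenario quoted above (cancel the other copies once one copy starts service) can be sketched as a dispatch rule. This is a minimal illustration under assumptions not stated in the excerpt: first-come-first-served servers, instantaneous cancellation, and a service time that is the same on every server. The function name `dispatch_cancel_on_start` and the `work_left` list are hypothetical.

```python
def dispatch_cancel_on_start(work_left, chosen, service_time):
    """Replicate a job on the `chosen` servers and cancel all but the first copy to start.

    work_left[s] is the unfinished work queued at server s when the job arrives.
    Under FCFS, the copy that starts first is the one at the chosen server with
    the least work left, so only that server ends up serving the job.
    Returns (serving server, waiting time before service starts).
    """
    server = min(chosen, key=lambda s: work_left[s])
    wait = work_left[server]
    work_left[server] += service_time  # the surviving copy adds its work here
    return server, wait

work_left = [3.0, 1.0, 2.5]
server, wait = dispatch_cancel_on_start(work_left, chosen=[0, 2], service_time=2.0)
print(server, wait, work_left)  # 2 2.5 [3.0, 1.0, 4.5]
```

Server 1 has the least work overall but was not among the chosen replicas, so the job waits behind the 2.5 units of work at server 2.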

Redundancy Scheduling in Systems with Bi-Modal Job Service Time Distributions

Behrouzi-Far, Soljanin 2019

2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton)

Self Cite


Queuing systems with redundant requests have drawn great attention because of their promise to reduce the job completion time and variability. Despite a large body of work on the topic, we are still far from fully understanding the benefits of redundancy in practice. We here take one step towards practical systems by studying queuing systems with bi-modal job service time distribution. Such distributions have been observed in practice, as can be seen in, e.g., Google cluster traces. We develop an analogy to a classical urns and balls problem, and use it to study the queuing time performance of two non-adaptive classical scheduling policies: random and round-robin. We introduce new performance indicators in the analogous model, and argue that they are good predictors of the queuing time in non-adaptive scheduling policies. We then propose a non-adaptive scheduling policy that is based on combinatorial designs, and show that it has better performance indicators. Simulations confirm that the proposed scheduling policy, as the performance indicators suggest, reduces the queuing times compared to random and round-robin scheduling.


“…If, at demand λ, there exists a splitting strategy under which no storage system node receives requests at a rate in excess of its service rate, then λ is said to be in the achievable service rate region of the storage system. More formally, the storage system's achievable service rate region S is the set of all $\lambda \in \mathbb{R}^{K}_{\geq 0}$ such that there exists a splitting strategy with (4) For any λ = (λ1, . .…”

Section: Preliminaries (mentioning)

confidence: 99%
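The verbal definition in this excerpt can be made concrete with a small feasibility check. The excerpt's formal condition (4) is not reproduced in the quote, so the sketch below encodes only the stated requirement that no node receives requests faster than it can serve them. The data layout (recovery sets as tuples of node names, `split` as a per-file dictionary) and the function name `is_feasible_split` are assumptions made for illustration. It checks one given splitting strategy; deciding whether any feasible strategy exists would require a search, for example a linear program.

```python
def is_feasible_split(demand, split, service_rate, tol=1e-9):
    """Check one splitting strategy against the node service rates.

    demand:       {file: request arrival rate}
    split:        {file: {recovery_set: rate}}, where recovery_set is a tuple of
                  nodes that must all be accessed to serve one request
    service_rate: {node: maximum request rate the node can serve}
    """
    load = {node: 0.0 for node in service_rate}
    for f, lam in demand.items():
        portions = split.get(f, {})
        if any(rate < 0 for rate in portions.values()):
            return False
        if abs(sum(portions.values()) - lam) > tol:
            return False  # the split must account for the whole demand
        for nodes, rate in portions.items():
            for node in nodes:
                load[node] += rate
    return all(load[node] <= service_rate[node] + tol for node in service_rate)

# Hypothetical 3-node system: file "a" is read from node "a" directly,
# or recovered by combining node "b" with the coded node "a+b".
mu = {"a": 1.0, "b": 1.0, "a+b": 1.0}
ok = is_feasible_split({"a": 1.5}, {"a": {("a",): 1.0, ("b", "a+b"): 0.5}}, mu)
too_much = is_feasible_split({"a": 2.5}, {"a": {("a",): 1.0, ("b", "a+b"): 1.5}}, mu)
print(ok, too_much)  # True False
```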

Service Rate Region of Content Access from Erasure Coded Storage

Anderson, Johnston, Joshi et al. 2018

2018 IEEE Information Theory Workshop (ITW)

Self Cite


We consider storage systems in which K files are stored over N nodes. A node may be systematic for a particular file in the sense that access to it gives access to the file. Alternatively, a node may be coded, meaning that it gives access to a particular file only when combined with other nodes (which may be coded or systematic). Requests for file f_k arrive at rate λ_k, and we are interested in the rate that can be served by a particular system. In this paper, we determine the set of request arrival rates for a 3-file coded storage system. We also provide an algorithm to maximize the rate of requests served for file K given λ_1, …, λ_{K−1} in a general K-file case.
