2016 IEEE International Symposium on Information Theory (ISIT)
DOI: 10.1109/isit.2016.7541478

Speeding up distributed machine learning using codes

Abstract: Codes are widely used in many engineering applications to offer robustness against noise. In large-scale systems there are several types of noise that can affect the performance of distributed machine learning algorithms (straggler nodes, system failures, or communication bottlenecks), but there has been little interaction cutting across codes, machine learning, and distributed systems. In this work, we provide theoretical insights on how coded solutions can achieve significant gains compared to uncoded ones. W…
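The coding idea the abstract refers to can be made concrete with a small sketch. Assuming a standard (n, k) MDS-style setup for coded matrix-vector multiplication (the block split, the sum-parity code, and all variable names here are illustrative choices, not the paper's exact construction), any k = 2 of n = 3 worker results recover A @ x:

    import numpy as np

    # Split A into k=2 row blocks and encode into n=3 coded blocks so
    # that A @ x is recoverable from ANY 2 of the 3 worker results.
    A = np.arange(12, dtype=float).reshape(4, 3)
    x = np.array([1.0, 2.0, 3.0])

    A1, A2 = A[:2], A[2:]      # systematic blocks
    A3 = A1 + A2               # parity block (a simple (3, 2) MDS code)

    # Each "worker" computes one small product; suppose worker 2 straggles.
    y1 = A1 @ x
    y3 = A3 @ x

    # Decode: recover the straggler's share from the parity result.
    y2 = y3 - y1
    assert np.allclose(np.concatenate([y1, y2]), A @ x)

With this redundancy the job finishes as soon as the fastest two workers respond, which is the source of the latency gains the abstract describes.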

Cited by 317 publications (817 citation statements)
References 30 publications (26 reference statements)

“…There is also an emerging body of work on using replication or erasure coding to mitigate stragglers in linear computations, such as matrix-vector multiplication (Dutta et al. 2016; Lee et al. 2016; Mallick et al. 2018) and matrix-matrix multiplication (Yang et al. 2017; Yu et al. 2017), and machine learning (Ferdinand and Draper 2016; Tandon et al. 2017). Our work is for general (possibly nonlinear) computations for which coding techniques cannot be directly applied, and we have to resort to simpler task replication strategies.…”
Section: Related Prior Work (citation type: mentioning)
confidence: 99%
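As a counterpoint to the coded linear schemes above, this excerpt's fallback for general nonlinear computations is plain task replication. A minimal first-response-wins sketch (the helper name replicate_first and the toy slow_square task are hypothetical, introduced only for illustration):

    from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait
    import random, time

    def replicate_first(fn, args, pool, r=2):
        # Submit the same task to r workers and keep the first result;
        # faster replicas mask stragglers for arbitrary (nonlinear) fn.
        futures = [pool.submit(fn, *args) for _ in range(r)]
        done, pending = wait(futures, return_when=FIRST_COMPLETED)
        for f in pending:
            f.cancel()               # best effort; a replica may already run
        return next(iter(done)).result()

    def slow_square(v):
        time.sleep(random.random())  # simulated straggling delay
        return v * v

    with ThreadPoolExecutor(max_workers=4) as pool:
        print(replicate_first(slow_square, (7,), pool, r=2))  # prints 49

Replication costs an r-fold increase in work but applies to any computation, which is exactly the trade-off the quoted passage is making.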
“…A wealth of straggler avoidance techniques have been proposed in recent years for DGD as well as other distributed computation tasks [5–48]. The common design notion behind all these schemes is the assignment of redundant computations/tasks to workers, such that faster workers can compensate for the stragglers.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
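One concrete instance of such redundant task assignment is gradient coding (Tandon et al. 2017), cited above. Below is a sketch of the standard small example from that line of work, with n = 3 workers tolerating s = 1 straggler; the toy gradients and the hand-derived decoding coefficients are illustrative:

    import numpy as np

    # Each worker i sends B[i] @ g, where g stacks the 3 partial gradients.
    # B is chosen so that ANY 2 of its rows combine to the all-ones row,
    # i.e. any 2 worker messages recover the full gradient sum.
    g = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # toy partial gradients
    B = np.array([[0.5, 1.0,  0.0],
                  [0.0, 1.0, -1.0],
                  [0.5, 0.0,  1.0]])

    sent = B @ g                 # row i = worker i's message
    # Suppose worker 1 (zero-indexed) straggles; decode from workers 0 and 2:
    # 1 * B[0] + 1 * B[2] = (1, 1, 1), so the same combination of messages
    # yields the sum of all partial gradients.
    a = np.array([1.0, 1.0])
    full_grad = a @ sent[[0, 2]]
    assert np.allclose(full_grad, g.sum(axis=0))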
“…The broadcast rate, or the rate of an index code, is the ratio of the code length to the length of each of the messages. The problem of designing index codes with the smallest possible broadcast rate is significant because of its applications, such as multimedia content delivery [3], coded caching [4], distributed computation [5], and also because of its relation to network coding [6], [7] and coding for distributed storage [8], [9].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
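The broadcast-rate definition in this excerpt is easiest to see on the smallest nontrivial instance: two receivers, each wanting one message and holding the other as side information. A toy sketch (message values arbitrary) showing that one XOR transmission achieves broadcast rate 1 instead of the uncoded rate 2:

    # Receiver 1 wants x1 and already knows x2; receiver 2 wants x2 and
    # knows x1. Uncoded broadcast needs 2 transmissions; one XOR suffices,
    # so the broadcast rate is 1 (code length / message length).
    x1, x2 = 0b1011, 0b0110      # two 4-bit messages

    coded = x1 ^ x2              # the single broadcast transmission

    # Each receiver decodes using its side information.
    assert coded ^ x2 == x1      # receiver 1 recovers x1
    assert coded ^ x1 == x2      # receiver 2 recovers x2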