2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton) 2015
DOI: 10.1109/allerton.2015.7447112

Coded MapReduce

Abstract: MapReduce is a commonly used framework for executing data-intensive tasks on distributed server clusters. We present "Coded MapReduce", a new framework that enables and exploits a particular form of coding to significantly reduce the inter-server communication load of MapReduce. In particular, Coded MapReduce exploits the repetitive mapping of data blocks at different servers to create coded multicasting opportunities in the shuffling phase, cutting down the total communication load by a multiplicative factor …
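The coded-multicasting idea named in the abstract can be illustrated with a minimal two-server sketch (all names and values here are hypothetical, not from the paper): when each server's Map phase has also computed the intermediate value the other server needs, one XOR-coded multicast replaces two separate unicasts.

```python
# Minimal sketch of a coded multicast in the shuffle phase, assuming two
# servers whose redundant Map computations give each of them the value
# the other one is missing. Values and names are illustrative only.
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

# Intermediate values produced in the Map phase (hypothetical):
v_for_server1 = b"value-A1"   # needed by server 1, already known to server 2
v_for_server2 = b"value-B2"   # needed by server 2, already known to server 1

# A single coded multicast serves both servers at once:
coded = xor_bytes(v_for_server1, v_for_server2)

# Each server cancels out the value it already computed locally:
recovered_at_1 = xor_bytes(coded, v_for_server2)
recovered_at_2 = xor_bytes(coded, v_for_server1)

assert recovered_at_1 == v_for_server1
assert recovered_at_2 == v_for_server2
```

One coded transmission in place of two is exactly the multiplicative saving the abstract refers to, generalized in the paper to many servers and higher redundancy.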

Cited by 144 publications (142 citation statements)
References 20 publications (75 reference statements)
“…A preliminary part of this result, in particular the achievability for the special case of s = 1, or the achievable scheme of Theorem 1, was presented in [1]. We note that when s = 1, Theorem 2 provides the same result as Theorem 1, i.e., L*(r, 1) = Remark 8.…”
Section: Problem Formulation
confidence: 55%
“…Here we do not impose any constraint on how the Map and Reduce functions are chosen (for example, they can be arbitrary linear or nonlinear functions). 1 When mapping a file, we compute Q intermediate values in parallel, one for each of the Q output functions. The main reason to do this is that parallel processing can be efficiently performed for applications that fit into the MapReduce framework.…”
Section: Problem Formulation
confidence: 99%
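The excerpt above describes a Map phase that, for each file, computes Q intermediate values — one per output (Reduce) function. A minimal sketch of that structure (the map functions here are placeholders, not the paper's):

```python
# Sketch: mapping one file yields Q intermediate values, one for each of
# the Q output functions, as the cited excerpt describes. The "g_q"
# functions are hypothetical stand-ins for arbitrary Map functions.
Q = 3  # number of output (Reduce) functions, illustrative

def map_file(file_contents: str) -> list:
    """Return one intermediate value per output function q = 0..Q-1."""
    return [f"g_{q}({file_contents})" for q in range(Q)]

intermediate = map_file("file_1")
# intermediate[q] is the value destined for the q-th Reduce function
```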
“…Coded caching has been studied in many scenarios such as decentralized coded caching [68], online coded caching [69], hierarchical coded caching for wireless communication [70], and device-to-device coded caching [71]. Recently, the authors in [72] proposed coded MapReduce that reduces the communication cost in the process of transferring the results of mappers to reducers.…”
Section: B. Data Shuffling and Communication Overheads
confidence: 99%
“…The first coding concept introduced in [6]- [8] enables an inverse-linear tradeoff between computation load and communication load in distributed computing. This result implies that increasing the computation load by a factor of r (i.e., evaluating each computation at r carefully chosen nodes) can create novel coding opportunities that reduce the required communication load for computing by the same factor r. Hence, these codes can be utilized to pool the underutilized computing resources at network edge to slash the communication load of Fog computing [9].…”
Section: Introduction
confidence: 99%
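The inverse-linear tradeoff described in the excerpt above can be sketched numerically. The load expressions below are assumed from the cited line of work on coded distributed computing (K servers, computation load r, each Map evaluated at r carefully chosen nodes), not stated in this page's text:

```python
# Hedged sketch of the computation/communication tradeoff: assuming the
# coded shuffle load (1/r) * (1 - r/K) versus the uncoded load 1 - 1/K,
# increasing the computation load r cuts communication roughly by r.
def coded_load(r: int, K: int) -> float:
    """Assumed coded communication load with redundancy r on K servers."""
    return (1.0 / r) * (1.0 - r / K)

def uncoded_load(K: int) -> float:
    """Assumed uncoded (conventional shuffle) communication load."""
    return 1.0 - 1.0 / K

K = 10
for r in (1, 2, 5):
    print(f"r={r}: coded load {coded_load(r, K):.3f}")
```

With r = 1 the coded load matches the uncoded baseline, and each further increase in r reduces the load by roughly that factor — the "inverse-linear" behavior the excerpt names.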
“…Using Lemma 3, the minimum and maximum induced costs for the task of computing r = 100 equations are C_m = 314.6 and C_M = 2516.8, respectively. It takes 15 iterations for the proposed heuristic search algorithm to arrive at the tuple (n_1, n_2, n_3) = (10, 6, 0). This corresponds to the expected cost 486.2 and the expected time E[T_HCMM] = 14.3.…”
confidence: 99%