2013
DOI: 10.1007/978-3-642-41527-2_2

When Distributed Computation Is Communication Expensive

Abstract: We consider a number of fundamental statistical and graph problems in the message-passing model, where we have k machines (sites), each holding a piece of data, and the machines want to jointly solve a problem defined on the union of the k data sets. The communication is point-to-point, and the goal is to minimize the total communication among the k machines. This model captures all point-to-point distributed computational models with respect to minimizing communication costs. Our analysis shows that exact com…

Cited by 35 publications (27 citation statements) · References 25 publications
“…Notice that in the Congested Clique model the input graph G is tightly coupled with the communication network N and the graph is distributed among the machines via a vertex partition. This is not the case in other related models for distributed graph processing, such as [1,30,19]. In these papers the input graph can be much larger than the machine network and the distribution of the graph among machines is via an edge partition.…”
Section: The Model
confidence: 94%
“…In [19] this edge partition is assumed to be random (initially), in [30] the edge partition can be worst case, whereas in [1] the edge partition is worst case, but with the requirement that each processor has the same number of edges. It is worth noting that [30] does prove message-complexity lower bounds for problems such as GC, but these lower bounds make crucial use of the worst-case distribution of edges and do not apply in our model. Similarly, the lower bounds in the setting of [1] do not seem to directly apply in the Congested Clique model.…”
Section: The Model
confidence: 99%
“…In distributed computing, the total amount of communication is often the most relevant complexity measure. For example, Woodruff and Zhang [44] and Klauck et al. [25] identify models and problems for which there is no algorithm that beats the communication benchmark of sending the entire input to a single machine. Because massively parallel systems are designed to send a potentially large amount of data in a single round, such communication lower bounds do not generally imply lower bounds for round complexity.…”
Section: Related Work
confidence: 99%
“…The coordinator model has attracted a lot of attention in recent years [1,25,41,45]. At a high level, it is similar to the congested clique model [16,35,36,40] and the k-machine model [32].…”
Section: Introduction
confidence: 99%
“…The k sites would like to jointly compute some statistical function f defined on S by treating items from the same group as the same item. For example, the distinct elements function is defined to be F0(S) = |G| = n. We always allow a (1 + ε)-approximation since for exact computation, in the worst case, there is often no better way than shipping all items to one site (for many statistical problems, this holds even in the noise-free case, see [45]). The precise meaning of the (1 + ε)-approximation depends on specific problems.…”
Section: Introduction
confidence: 99%
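The trade-off described in the citation above, where exact distinct-elements counting (F0) may require shipping all items to one site while a (1 + ε)-approximation can be far cheaper, can be illustrated with a standard KMV (k-minimum-values) sketch. This is a minimal sketch under assumptions, not the algorithm from the cited paper; the function names (`kmv_sketch`, `merge_and_estimate`) are hypothetical. Each site ships only its k smallest hash values to the coordinator, so per-site communication is O(k) rather than proportional to the site's data size.

```python
import hashlib

def hash01(item):
    # Stable hash mapping an item to a pseudo-random value in [0, 1).
    digest = hashlib.sha256(str(item).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def kmv_sketch(items, k):
    # A site keeps only the k smallest hash values of its distinct items;
    # this small list is all it ships to the coordinator.
    return sorted({hash01(x) for x in items})[:k]

def merge_and_estimate(sketches, k):
    # Coordinator merges per-site sketches and estimates F0 over the union.
    merged = sorted(set().union(*map(set, sketches)))[:k]
    if len(merged) < k:
        return len(merged)            # fewer than k distinct items overall: exact
    return int((k - 1) / merged[-1])  # standard KMV estimator (k - 1) / v_k

# Three sites whose union has 8 distinct items (F0 = 8).
sites = [[1, 2, 3, 4], [3, 4, 5, 6], [5, 6, 7, 8]]
print(merge_and_estimate([kmv_sketch(s, 16) for s in sites], 16))  # exact branch: 8
```

Because hash values of the same item collide across sites, duplicated items contribute only once to the merged sketch, which is exactly the "treat items from the same group as the same item" semantics; the estimator's accuracy improves as k grows, matching the (1 + ε) guarantee at k = Θ(1/ε²).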