We study two fundamental graph problems, Graph Connectivity (GC) and Minimum Spanning Tree (MST), in the well-studied Congested Clique model, and present several new bounds on the time and message complexities of randomized algorithms for these problems. No non-trivial (i.e., super-constant) time lower bounds are known for either of the aforementioned problems; in particular, an important open question is whether or not constant-round algorithms exist for these problems. We make progress toward answering this question by presenting randomized Monte Carlo algorithms for both problems that run in O(log log log n) rounds (where n is the size of the clique). Our results improve by an exponential factor on the long-standing (deterministic) time bound of O(log log n) rounds for these problems due to Lotker et al. (SICOMP 2005). Our algorithms make use of several algorithmic tools including graph sketching, random sampling, and fast sorting. The second contribution of this paper is to present several almost-tight bounds on the message complexity of these problems. Specifically, we show that Ω(n²) messages are needed by any algorithm (including randomized Monte Carlo algorithms, and regardless of the number of rounds) that solves the GC (and hence also the MST) problem if each machine in the Congested Clique has initial knowledge only of itself (the so-called KT0 model). In contrast, if the machines have initial knowledge of their neighbors' IDs (the so-called KT1 model), we present a randomized Monte Carlo algorithm for MST that uses O(n polylog n) messages and runs in O(polylog n) rounds. To complement this, we also present a lower bound in the KT1 model that shows that Ω(n) messages are required by any algorithm that solves GC, regardless of the number of rounds used. Our results are a step toward understanding the power of randomization in the Congested Clique with respect to both time and message complexity.
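To make the "exponential factor" between O(log log n) and O(log log log n) concrete, here is a minimal numeric sketch. It ignores the constants hidden in the O-notation, uses base-2 logarithms as an assumption, and takes log2(n) as input so that very large clique sizes can be expressed; the function names are hypothetical.

```python
import math

def loglog_n(log2_n: float) -> float:
    """log2(log2(n)), given log2(n) directly."""
    return math.log2(log2_n)

def logloglog_n(log2_n: float) -> float:
    """log2(log2(log2(n))), given log2(n) directly."""
    return math.log2(math.log2(log2_n))

# Compare the two growth rates for n = 2^16, 2^256, 2^65536.
comparison = {e: (loglog_n(e), logloglog_n(e)) for e in (16, 256, 65536)}
for exponent, (old, new) in comparison.items():
    print(f"n = 2^{exponent}: log log n = {old}, log log log n = {new}")
```

Each value in the second column is the logarithm of the corresponding value in the first, which is the sense in which the improvement is exponential.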
Motivated by the increasing need to understand the distributed algorithmic foundations of large-scale graph computations, we study some fundamental graph problems in a message-passing model for distributed computing where k ≥ 2 machines jointly perform computations on graphs with n nodes (typically, n ≫ k). The input graph is assumed to be initially randomly partitioned among the k machines, a common implementation in many real-world systems. Communication is point-to-point, and the goal is to minimize the number of communication rounds of the computation. Our main contribution is the General Lower Bound Theorem, a theorem that can be used to show non-trivial lower bounds on the round complexity of distributed large-scale data computations. The General Lower Bound Theorem is established via an information-theoretic approach that relates the round complexity to the minimal amount of information required by machines to solve the problem. Our approach is generic and this theorem can be used in a "cookbook" fashion to show distributed lower bounds in the context of several problems, including non-graph problems. We present two applications by showing (almost) tight lower bounds for the round complexity of two fundamental graph problems, namely PageRank computation and triangle enumeration. Our approach, as demonstrated in the case of PageRank, can yield tight lower bounds for problems (including, and especially, under a stochastic partition of the input) where communication complexity techniques are not obvious.
Our approach, as demonstrated in the case of triangle enumeration, can yield stronger round lower bounds as well as message-round tradeoffs compared to approaches that use communication complexity techniques. We then present distributed algorithms for PageRank and triangle enumeration with a round complexity that (almost) matches the respective lower bounds; these algorithms exhibit a round complexity which scales superlinearly in k, improving significantly over previous results for these problems [Klauck et al., SODA 2015]. Specifically, we show the following results:
• PageRank: We show a lower bound of Ω̃(n/k²) rounds, and present a distributed algorithm that computes an approximation of the PageRank of all the nodes of a graph in Õ(n/k²) rounds.
• Triangle enumeration: We show that there exist graphs with m edges where any distributed algorithm requires Ω̃(m/k^(5/3)) rounds. This result also implies the first non-trivial lower bound of Ω̃(n^(1/3)) rounds for the congested clique model, which is tight up to logarithmic factors. We then present a distributed algorithm that enumerates all the triangles of a graph in Õ(m/k^(5/3) + n/k^(4/3)) rounds.
The focus of this paper is on the distributed processing of large-scale data, in particular, graph data, which is becoming increasingly important with the rise of massive graphs such as the Web graph, social networks, biological networks, and other graph-structured data and the consequent need for fast distributed algorithms to process such graphs. Several large-scale graph processi...
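As a quick consistency check on the triangle enumeration bounds quoted above, the congested clique bound can be recovered by specialization. This assumes the congested clique corresponds to the case k = n with a dense input graph of m = Θ(n²) edges; that reading is an assumption made here for illustration, not a restatement of the paper's proof.

```latex
\tilde{\Omega}\!\left(\frac{m}{k^{5/3}}\right)
\;\xrightarrow{\;k = n,\; m = \Theta(n^{2})\;}\;
\tilde{\Omega}\!\left(\frac{n^{2}}{n^{5/3}}\right)
= \tilde{\Omega}\!\left(n^{1/3}\right),
\qquad
\tilde{O}\!\left(\frac{m}{k^{5/3}} + \frac{n}{k^{4/3}}\right)
\;\xrightarrow{\;k = n,\; m = \Theta(n^{2})\;}\;
\tilde{O}\!\left(n^{1/3} + n^{-1/3}\right)
= \tilde{O}\!\left(n^{1/3}\right).
```

Under the same substitution the upper and lower bounds agree up to polylogarithmic factors, which matches the claim that the congested clique bound is tight up to logarithmic factors.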
This article presents a randomized (Las Vegas) distributed algorithm that constructs a minimum spanning tree (MST) in weighted networks with optimal (up to polylogarithmic factors) time and message complexity. This algorithm runs in Õ(D + √n) time and exchanges Õ(m) messages (both with high probability), where n is the number of nodes of the network, D is the hop-diameter, and m is the number of edges. This is the first distributed MST algorithm that matches simultaneously the time lower bound of Ω̃(D + √n) [10] and the message lower bound of Ω(m) [31], which both apply to randomized Monte Carlo algorithms. The prior time and message lower bounds are derived using two completely different graph constructions; the existing lower-bound construction that shows one lower bound does not work for the other. To complement our algorithm, we present a new lower-bound graph construction for which any distributed MST algorithm requires both Ω̃(D + √n) rounds and Ω(m) messages.
This paper presents a randomized (Las Vegas) distributed algorithm that constructs a minimum spanning tree (MST) in weighted networks with optimal (up to polylogarithmic factors) time and message complexity. This algorithm runs in Õ(D + √n) time and exchanges Õ(m) messages (both with high probability), where n is the number of nodes of the network, D is the diameter, and m is the number of edges. This is the first distributed MST algorithm that matches simultaneously the time lower bound of Ω̃(D + √n) [Elkin, SIAM J. Comput. 2006] and the message lower bound of Ω(m) [Kutten et al., J. ACM 2015], which both apply to randomized Monte Carlo algorithms. The prior time and message lower bounds are derived using two completely different graph constructions; the existing lower bound construction that shows one lower bound does not work for the other. To complement our algorithm, we present a new lower bound graph construction for which any distributed MST algorithm requires both Ω̃(D + √n) rounds and Ω(m) messages.
* A preliminary version of this paper [35] appeared in the ...
... message complexity of this algorithm is (essentially) optimal (the original algorithm has a message complexity of O(m log n), but it can be improved to O(m + n log n)), but its time complexity is not. Hence further research concentrated on improving the time complexity. The time complexity was first improved to O(n log log n) by Chin and Ting [5], further improved to O(n log* n) by Gafni [12], and then to O(n) by Awerbuch [2] (see also [11]). The O(n) bound is existentially optimal in the sense that there exist graphs for which this is the best possible. This was the state of the art until the mid-nineties, when Garay, Kutten, and Peleg [14] raised the question of whether it is possible to identify graph parameters that can better capture the complexity of distributed network computations.
In fact, for many existing networks, their diameter D is significantly smaller than the number of vertices n, and therefore it is desirable to design protocols whose running time is bounded in terms of D rather than in terms of n. Garay, Kutten, and Peleg [14] gave the first such distributed algorithm for the MST problem with running time O(D + n^0.614 log* n), which was later improved by Kutten and Peleg [28] to O(D + √n log* n). However, neither of these algorithms is message-optimal, as they exchange O(m + n^1.614) and O(m + n^1.5) messages, respectively. All the above results, as well as the one in this paper, hold in the synchronous CONGEST model of distributed computing, a well-studied standard model of distributed computing [37] (see Section 1.1). The lack of progress in improving the result of [28], and in particular breaking the Õ(√n) barrier, led to work on lower bounds for the distributed MST problem. Peleg and Rubinovich [38] showed that Ω(D + √n/log n) time is required by any distributed algorithm for constructing an MST, even on networks of small diameter (D = Ω(log n)); thus, this result establishes the asymptotic near-tight optimality of ...
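The running-time bounds above involve the iterated logarithm log* n, which grows extremely slowly. The following minimal sketch computes it (base 2, as an assumption) and compares the two bounds with all hidden constants set to 1; `log_star`, `gkp_bound`, and `kp_bound` are hypothetical helper names, and the numbers only illustrate growth rates, not actual round counts.

```python
import math

def log_star(n: float) -> int:
    """Iterated logarithm: number of times log2 is applied until the value is <= 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count

def gkp_bound(n: int, d: int) -> float:
    """D + n^0.614 * log* n, with constants ignored."""
    return d + n ** 0.614 * log_star(n)

def kp_bound(n: int, d: int) -> float:
    """D + sqrt(n) * log* n, with constants ignored."""
    return d + math.sqrt(n) * log_star(n)

n, d = 10**6, 10
print(log_star(n), gkp_bound(n, d), kp_bound(n, d))
```

For small-diameter networks the n-dependent term dominates, which is why reducing the exponent from 0.614 to 0.5 matters.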
Motivated by the increasing need to understand the algorithmic foundations of distributed large-scale graph computations, we study a number of fundamental graph problems in a message-passing model for distributed computing where k ≥ 2 machines jointly perform computations on graphs with n nodes (typically, n ≫ k). The input graph is assumed to be initially randomly partitioned among the k machines, a common implementation in many real-world systems. Communication is point-to-point, and the goal is to minimize the number of communication rounds of the computation. Our main result is an (almost) optimal distributed randomized algorithm for graph connectivity. Our algorithm runs in Õ(n/k²) rounds (Õ notation hides a polylog(n) factor and an additive polylog(n) term). This improves over the best previously known bound of Õ(n/k) [Klauck et al., SODA 2015], and is optimal (up to a polylogarithmic factor) in view of an existing lower bound of Ω̃(n/k²). Our improved algorithm uses a number of techniques, including linear graph sketching, that prove useful in the design of efficient distributed graph algorithms. Using the connectivity algorithm as a building block, we then present fast randomized algorithms for computing minimum spanning trees, (approximate) min-cuts, and for many graph verification problems. All these algorithms take Õ(n/k²) rounds, and are optimal up to polylogarithmic factors. We also show an almost matching lower bound of Ω̃(n/k²) rounds for many graph verification problems by leveraging lower bounds in random-partition communication complexity.
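The abstract does not describe how linear graph sketching is applied, so the following is only a toy illustration of the linearity property such sketches rely on, not the paper's algorithm. Each node XORs together the IDs of its incident edges; combining the per-node values over a set of nodes cancels every edge internal to the set and leaves the XOR of the edges leaving it. The edge encoding and all function names are hypothetical, and the example is chosen so that exactly one edge leaves the component (practical sketches add random sampling to handle the general case).

```python
from functools import reduce
from operator import xor

def edge_id(u: int, v: int, n: int) -> int:
    """Encode undirected edge {u, v} (u != v) as a unique positive integer."""
    a, b = min(u, v), max(u, v)
    return a * n + b

def decode_edge(eid: int, n: int) -> tuple:
    """Invert edge_id."""
    return divmod(eid, n)

def node_sketch(node: int, edges: list, n: int) -> int:
    """XOR of the IDs of all edges incident to `node`."""
    return reduce(xor, (edge_id(u, v, n) for u, v in edges if node in (u, v)), 0)

def component_sketch(component: set, edges: list, n: int) -> int:
    """XOR of node sketches; internal edges appear twice and cancel."""
    return reduce(xor, (node_sketch(x, edges, n) for x in component), 0)

n = 6
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5)]
component = {0, 1, 2}  # a triangle with a single outgoing edge (2, 3)
outgoing = decode_edge(component_sketch(component, edges, n), n)
print(outgoing)  # (2, 3)
```

Because the combination step is a plain XOR, machines can merge sketches of different node sets without exchanging full adjacency lists, which is the kind of saving that makes sketching attractive in bandwidth-limited models.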