This is a survey of the exciting recent progress made in understanding the complexity of distributed subgraph finding problems. It overviews the results and techniques for assorted variants of subgraph finding problems in various models of distributed computing, and states intriguing open questions. This version contains some updates over the ICALP 2021 version, and I will try to keep updating it as additional progress is made.

However, it is possible to do better, as we overview in this section.
Triangle Finding in the CLIQUE Model

We begin with the CLIQUE model.

Triangle listing in the CLIQUE model. The first non-trivial algorithm for triangle finding is due to Dolev, Lenzen, and Peled [DLP12]. This is a deterministic triangle listing algorithm for the CLIQUE model, which has a complexity of $O(n^{1/3}/\log n)$ rounds. The simplicity of this algorithm turned out to be a huge advantage for later results, as we will see.

The algorithm works as follows. The vertices of the graph are partitioned into $n^{1/3}$ subsets $S_1, \dots, S_{n^{1/3}}$, each of $n^{2/3}$ nodes. Each of the $n$ nodes receives a different tuple of three of these subsets. A node that receives $S_{i_1}, S_{i_2}, S_{i_3}$ for indices $1 \le i_1, i_2, i_3 \le n^{1/3}$ (that are not necessarily different) collects all edges with one endpoint in one of the three subsets and one endpoint in another. That is, this node collects all edges in $E(S_{i_1}, S_{i_2}) \cup E(S_{i_1}, S_{i_3}) \cup E(S_{i_2}, S_{i_3})$, and reports all triangles that it finds. It is straightforward to see that all triangles are listed by this algorithm: the number of 3-tuples of subsets is $(n^{1/3})^3 = n$, so each 3-tuple is handled by some node, and the three vertices of any triangle lie in the subsets of some 3-tuple.

The round complexity of the algorithm follows by proving that each node needs to send and receive $O(n^{4/3})$ edges in total, to and from locations that are known to all nodes, because the partition into subsets is hardcoded (we will discuss this knowledge property later).

Sending: Take a node $v$ and assume that it is in the subset $S_i$. There can be at most $n^{2/3}$ edges between $v$ and nodes in $S_j$, and these edges need to be sent to all nodes that have $S_i$ and $S_j$ in their 3-tuple. Since there are $n^{1/3}$ such 3-tuples, these $n^{2/3}$ edges need to be sent to $n^{1/3}$ nodes. Repeating this for all $n^{1/3}$ possibilities for $j$ gives a total of $n^{2/3+1/3+1/3} = n^{4/3}$ edges that $v$ has to send.

Receiving: Each node needs to learn 3 subsets of edges, each containing at most $n^{2/3} \cdot n^{2/3} = n^{4/3}$ edges.
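As a compact restatement of the two counting arguments above, the per-node loads can be written out as follows. Constant factors (such as the position of $S_i$ and $S_j$ within an ordered 3-tuple) are absorbed into the $O(\cdot)$ notation, which is an assumption of this summary rather than a statement taken from [DLP12].

$$
\text{sent by } v \;\le\; \underbrace{n^{1/3}}_{\text{choices of } j} \cdot \underbrace{n^{2/3}}_{\text{edges from } v \text{ into } S_j} \cdot \underbrace{O(n^{1/3})}_{\text{3-tuples containing } S_i,\, S_j} \;=\; O(n^{4/3}),
$$

$$
\text{received by any node} \;\le\; 3 \cdot n^{2/3} \cdot n^{2/3} \;=\; 3n^{4/3} \;=\; O(n^{4/3}).
$$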
To conclude the complexity analysis, one can use the simple claim that [DLP12] proves, which states that a routing task in which each node needs to send and receive $n$ messages in a known pattern can be done in 2 rounds. This means that the $O(n^{4/3})$ messages sent and received per node can be split into $O(n^{1/3})$ such routing tasks of $n$ messages each, yielding a complexity of $O(n^{1/3})$ rounds. Noticing that the partition and routing are fixed, one can refrain from sending actual edge identifiers and replace them with a bit mask, which saves a logarithmic factor and results in a complexity of $O(n^{1/3}/\log n)$ rounds....
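The subset-triple assignment described above can be illustrated with a small centralized simulation. This is only a sketch under stated assumptions: it does not model message passing or the routing scheme of [DLP12]; the function name `list_triangles`, the parameter `k` (standing in for $n^{1/3}$, assumed to satisfy `k**3 == n_nodes`), and the choice of contiguous vertex blocks as the hardcoded partition are all hypothetical choices made for illustration.

```python
from itertools import combinations, product


def list_triangles(n_nodes, edges, k):
    """Centralized sketch of the subset-triple triangle listing idea.

    Assumes a simple undirected graph on vertices 0..n_nodes-1 and
    k**3 == n_nodes, so that k plays the role of n^(1/3).
    """
    assert k ** 3 == n_nodes, "illustration assumes n is a perfect cube"
    size = n_nodes // k  # n^(2/3) vertices per subset

    def subset_of(v):
        # Hardcoded partition (an assumption): contiguous blocks of vertices.
        return v // size

    # Normalize edges to sorted pairs and drop self-loops.
    edge_set = {tuple(sorted(e)) for e in edges if e[0] != e[1]}
    triangles = set()

    # There are k^3 = n ordered triples, one per simulated node.
    for triple in product(range(k), repeat=3):
        # Pairs of subsets whose connecting edges this node collects;
        # a repeated index means edges inside a single subset.
        wanted = {frozenset(p) for p in combinations(triple, 2)}
        local = [
            (u, v) for (u, v) in edge_set
            if frozenset((subset_of(u), subset_of(v))) in wanted
        ]
        # Build a local adjacency map from the collected edges only.
        adj = {}
        for u, v in local:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        # Report every triangle visible in the local edge set.
        for u, v in local:
            for w in adj[u] & adj[v]:
                triangles.add(tuple(sorted((u, v, w))))
    return triangles


if __name__ == "__main__":
    example_edges = [(0, 1), (1, 2), (0, 2), (2, 5), (5, 7), (2, 7), (3, 4)]
    print(sorted(list_triangles(8, example_edges, 2)))
```

In this sketch the same triangle may be found by several simulated nodes (for example, under different orderings of the same three subsets); collecting the results in a set removes such duplicates.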