2021
DOI: 10.48550/arxiv.2111.14522
Preprint

Understanding over-squashing and bottlenecks on graphs via curvature

Abstract: Most graph neural networks (GNNs) use the message passing paradigm, in which node features are propagated on the input graph. Recent works pointed to the distortion of information flowing from distant nodes as a factor limiting the efficiency of message passing for tasks relying on long-distance interactions. This phenomenon, referred to as 'over-squashing', has been heuristically attributed to graph bottlenecks where the number of k-hop neighbors grows rapidly with k. We provide a precise description of the over-squashing phenomenon in GNNs and analyze how it arises from bottlenecks in the graph. For this purpose, we introduce a new edge-based combinatorial curvature and prove that negatively curved edges are responsible for the over-squashing issue. We also propose and experimentally test a curvature-based graph rewiring method to alleviate the over-squashing.

Cited by 30 publications (72 citation statements)
References 27 publications
“…Conveniently, the fully connected view also encompasses spectrally defined graph convolutions, such as the graph Fourier transform (Bruna et al., 2013). Nontrivial changes to N_u, such as multi-hop layers (Defferrard et al., 2016), rewiring based on diffusion (Klicpera et al., 2019) or curvature (Topping et al., 2021), and subsampling (Hamilton et al., 2017) are also supported. Lastly, the methods which dynamically alter the adjacency in a learnable fashion (Kipf et al., 2018; Wang et al., 2019; Kazi et al., 2020) can also be classified under this umbrella.…”
Section: Graph Rewiring
confidence: 99%
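To make the curvature-based rewiring idea in the excerpt above concrete, here is a minimal NumPy sketch. It scores edges with a simplified augmented Forman curvature, 4 − deg(u) − deg(v) + 3·(#triangles through the edge), as a cheap stand-in for the balanced Forman curvature that Topping et al. (2021) actually define, and greedily adds one support edge around the most negatively curved edge. The function names and the greedy selection rule are illustrative assumptions, not the paper's stochastic discrete Ricci flow algorithm.

```python
import numpy as np

def forman_curvature(A, u, v):
    """Simplified augmented Forman curvature of edge (u, v) on an
    unweighted undirected graph: 4 - deg(u) - deg(v) + 3 * (#triangles
    through the edge). Negative values flag bottleneck-like edges."""
    triangles = int((A[u] * A[v]).sum())   # common neighbours of u and v
    return 4 - int(A[u].sum()) - int(A[v].sum()) + 3 * triangles

def rewire_once(A):
    """One greedy rewiring step: locate the most negatively curved edge
    (u, v) and add a 'support' edge from a neighbour of u to v, closing
    a triangle over (u, v) and hence raising its curvature."""
    rows, cols = np.nonzero(np.triu(A))
    u, v = min(zip(rows, cols), key=lambda e: forman_curvature(A, *e))
    for w in np.nonzero(A[u])[0]:
        if w != v and not A[w, v]:
            A[w, v] = A[v, w] = 1          # new edge closes a triangle
            break
    return A

# Example: two 4-cliques joined by a single bridge; the bridge is the
# most negatively curved edge, so rewiring adds an edge across it.
A = np.zeros((8, 8), dtype=int)
A[:4, :4] = 1 - np.eye(4, dtype=int)
A[4:, 4:] = 1 - np.eye(4, dtype=int)
A[3, 4] = A[4, 3] = 1                      # bridge between the cliques
A = rewire_once(A)
```

On this barbell example the bridge scores −4 while the clique edges score +4, so the new edge lands straight across the bottleneck.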
“…Differential equations have historically played a role in designing and interpreting various algorithms in machine learning, including non-linear dimensionality reduction methods (Belkin & Niyogi, 2003; Coifman & Lafon, 2006). Chamberlain et al. (2021b) used parabolic diffusion-type PDEs to design GNNs, using graph gradient and divergence operators as the spatial differential operator, a transformer-type attention as a learnable diffusivity function ('1-neighborhood coupling' in our terminology), and a variety of time-stepping schemes to discretize the temporal dimension in this framework. Chamberlain et al. (2021a) applied a non-Euclidean diffusion equation ('Beltrami flow') to a joint positional-feature space, yielding a scheme with adaptive spatial derivatives ('graph rewiring'), and Topping et al. (2021) studied a discrete geometric PDE similar to Ricci flow to improve information propagation in GNNs. We can see the contrast between the diffusion-based methods of Chamberlain et al. (2021b,a) and GraphCON in the simple case of identity activation σ(x) = x and no residual connection (W = 0 and b = 0).…”
Section: Related Work
confidence: 99%
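To see what a diffusion-type GNN layer looks like as a time-stepping scheme, here is a minimal sketch, assuming a fixed symmetrically normalised adjacency in place of the learnable attention-based diffusivity the excerpt describes; the name diffusion_layer and the values of tau and steps are illustrative choices.

```python
import numpy as np

def diffusion_layer(A, X, tau=0.1, steps=10):
    """Explicit-Euler integration of the graph diffusion equation
    dX/dt = (A_hat - I) X, with A_hat = D^{-1/2} A D^{-1/2} the
    symmetrically normalised adjacency. Diffusion-type GNNs replace
    the fixed A_hat with a learnable (e.g. attention-based) diffusivity."""
    deg = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1))  # guard isolated nodes
    A_hat = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    for _ in range(steps):
        X = X + tau * (A_hat @ X - X)               # one Euler step of size tau
    return X
```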
“…Several recent works proposed Graph ML models based on differential equations coming from physics (Avelar et al., 2019; Poli et al., 2019b; Zhuang et al., 2020; Xhonneux et al., 2020b), including diffusion (Chamberlain et al., 2021b) and wave (Eliasof et al., 2021) equations, and geometric equations such as Beltrami (Chamberlain et al., 2021a) and Ricci (Topping et al., 2021) flows. Such approaches not only allow popular GNN models to be recovered as discretization schemes for the underlying differential equations, but can also, in some cases, address problems encountered in traditional GNNs, such as oversmoothing (Nt & Maehara, 2019; Oono & Suzuki, 2020) and bottlenecks (Alon & Yahav, 2021).…”
Section: Introduction
confidence: 99%
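A toy numeric check of the oversmoothing problem cited above, under the assumption of a plain row-stochastic averaging operator: on a connected graph, repeated averaging drives every feature column toward a constant, so the per-feature spread across nodes shrinks with depth. Graph size, edge probability, and step counts are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
A = (rng.random((n, n)) < 0.3).astype(float)
A = np.triu(A, 1)
A = A + A.T                                          # undirected, no self-loops
P = A / np.maximum(A.sum(axis=1, keepdims=True), 1)  # row-stochastic averaging
X = rng.standard_normal((n, 4))                      # random node features
for k in (1, 10, 100):
    Xk = np.linalg.matrix_power(P, k) @ X            # k rounds of averaging
    print(k, np.std(Xk, axis=0).mean())              # spread shrinks with depth
```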
“…For example, the expressiveness of such GNNs is bounded by the Weisfeiler-Lehman isomorphism hierarchy [23]. Also, GNNs are known to suffer from over-squashing [24], where information propagated between distant nodes is distorted. Due to these limitations, the node embeddings created by GNNs have limited expressiveness.…”
Section: A. Challenges in Device Placement
confidence: 99%
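The bottleneck intuition behind over-squashing, as stated in the abstract above (the number of k-hop neighbors grows rapidly with k), is easy to quantify. In a complete binary tree the k-hop frontier of the root doubles at every hop, yet message passing must compress all of those messages into the root's fixed-size feature vector. The BFS helper below is an illustrative sketch, with the tree depth chosen arbitrarily.

```python
import numpy as np

def khop_counts(A, root, K):
    """Number of nodes at exactly k hops from `root`, for k = 1..K (BFS)."""
    frontier, seen, counts = {root}, {root}, []
    for _ in range(K):
        frontier = {int(v) for u in frontier for v in np.nonzero(A[u])[0]} - seen
        seen |= frontier
        counts.append(len(frontier))
    return counts

# Complete binary tree of depth 4 (31 nodes, root 0): the k-hop frontier
# doubles every hop, so exponentially many messages squeeze through the
# root's fixed-size representation.
n = 31
A = np.zeros((n, n), dtype=int)
for child in range(1, n):
    parent = (child - 1) // 2
    A[parent, child] = A[child, parent] = 1
print(khop_counts(A, root=0, K=4))  # -> [2, 4, 8, 16]
```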