Measuring and Relieving the Over-smoothing Problem for Graph Neural Networks from the Topological View

Chen, Deli; Lin, Yankai; Li, Wei; Li, Peng; Zhou, Jie; Sun, Xu

doi:10.48550/arxiv.1909.03211

Cited by 17 publications

(30 citation statements)

References 22 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…(2) The performance improvements of AdaGNN-R and AdaGNN-S are more obvious on BlogCatalog and Flickr compared with other datasets. The reason could be attributed to their high average node degree of social network (as indicated in Table 1) -nodes are influenced by more neighbors during neighborhood aggregation, which is consistent with the observations in previous literature [6]. For such dataset with high average node degree, the adaptive frequency response provided by AdaGNN can help achieve more appropriate feature smoothness.…”

Section: Resultssupporting

confidence: 82%

“…Li et al [29] proved that GCN is actually a kind of Laplacian smoothing process, and proposed the challenge of oversmoothing for the first time. After that, some studies demonstrate that certain level of smoothness benefits node representation learning while over-smoothing broadly exists in deeper GNNs [6,12]. More recently, some researches attempt to relieve this problem via residual-like connections [8,31,32].…”

Section: Resultsmentioning

confidence: 99%

See 1 more Smart Citation

AdaGNN

Dong

Ding

Jalaian

et al. 2021

Proceedings of the 30th ACM International Conference on Information &Amp; Knowledge Management

View full text Add to dashboard Cite

Graph Neural Networks have recently become a prevailing paradigm for various high-impact graph analytical problems. Existing efforts can be mainly categorized as spectral-based and spatialbased methods. The major challenge for the former is to find an appropriate graph filter to distill discriminative information from input signals for learning. Recently, myriads of explorations are made to achieve better graph filters, e.g., Graph Convolutional Network (GCN), which leverages Chebyshev polynomial truncation to seek an approximation of graph filters and bridge these two families of methods. Nevertheless, it has been shown in recent studies that GCN and its variants are essentially employing fixed low-pass filters to perform information denoising. Thus their learning capability is rather limited and may over-smooth node representations at deeper layers. To tackle these problems, we develop a novel graph neural network framework AdaGNN with a well-designed adaptive frequency response filter. At its core, AdaGNN leverages a simple but elegant trainable filter that spans across multiple layers to capture the varying importance of different frequency components for node representation learning. The inherent differences among different feature channels are also well captured by the filter. As such, it empowers AdaGNN with stronger expressiveness and naturally alleviates the over-smoothing problem. We empirically validate the effectiveness of the proposed framework on various benchmark datasets. Theoretical analysis is also provided to show the superiority of the proposed AdaGNN. The open-source implementation of AdaGNN can be found here: https://github.com/yushundong/AdaGNN.

show abstract

Section: Resultssupporting

confidence: 82%

Section: Resultsmentioning

confidence: 99%

AdaGNN

Dong

Ding

Jalaian

et al. 2021

Proceedings of the 30th ACM International Conference on Information &Amp; Knowledge Management

View full text Add to dashboard Cite

show abstract

“…Node pair distance has been widely adopted to quantify the over-smoothing based on embedding similarities [18,22]. Among the series of distance metrics, Dirichlet energy is simple and expressive for the over-smoothing analysis [32].…”

Section: Dirichlet Energy Constrained Learningmentioning

confidence: 99%

“…As the layer number increases, the node representations will converge to indistinguishable vectors due to the recursive neighborhood aggregation and non-linear activation [15,16]. Such phenomenon is recognized as over-smoothing issue [17,18,19,20,21]. It prevents the stacking of many layers and modeling the dependencies to high-order neighbors.…”

Section: Introductionmentioning

confidence: 99%

Dirichlet Energy Constrained Learning for Deep Graph Neural Networks

Zhou,

Huang,

Zha

et al. 2021

Preprint

View full text Add to dashboard Cite

Graph neural networks (GNNs) integrate deep architectures and topological structure modeling in an effective way. However, the performance of existing GNNs would decrease significantly when they stack many layers, because of the oversmoothing issue. Node embeddings tend to converge to similar vectors when GNNs keep recursively aggregating the representations of neighbors. To enable deep GNNs, several methods have been explored recently. But they are developed from either techniques in convolutional neural networks or heuristic strategies. There is no generalizable and theoretical principle to guide the design of deep GNNs. To this end, we analyze the bottleneck of deep GNNs by leveraging the Dirichlet energy of node embeddings, and propose a generalizable principle to guide the training of deep GNNs. Based on it, a novel deep GNN framework -EGNN is designed. It could provide lower and upper constraints in terms of Dirichlet energy at each layer to avoid over-smoothing. Experimental results demonstrate that EGNN achieves state-of-the-art performance by using deep layers.Preprint. Under review.

show abstract

“…This, however, is not always successful in capturing information from far-away nodes as information can be aggregated from too many nodes, drowning out relevant contributions. This is the over-smoothing phenomenon [18,19] which, for excessively large aggregation ranges, can produce output features that are very similar across the different nodes since the aggregation ranges of the different nodes overlap strongly.…”

Section: Related Workmentioning

confidence: 99%

On Local Aggregation in Heterophilic Graphs

Mostafa,

Nassar,

Majumdar

2021

Preprint

View full text Add to dashboard Cite

Many recent works have studied the performance of Graph Neural Networks (GNNs) in the context of graph homophily -a label-dependent measure of connectivity. Traditional GNNs generate node embeddings by aggregating information from a node's neighbors in the graph. Recent results in node classification tasks show that this local aggregation approach performs poorly in graphs with low homophily (heterophilic graphs). Several mechanisms have been proposed to improve the accuracy of GNNs on such graphs by increasing the aggregation range of a GNN layer, either through multi-hop aggregation, or through long-range aggregation from distant nodes. In this paper, we show that properly tuned classical GNNs and multi-layer perceptrons match or exceed the accuracy of recent long-range aggregation methods on heterophilic graphs. Thus, our results highlight the need for alternative datasets to benchmark long-range GNN aggregation mechanisms. We also show that homophily is a poor measure of the information in a node's local neighborhood and propose the Neighborhood Information Content (NIC) metric, which is a novel information-theoretic graph metric. We argue that NIC is more relevant for local aggregation methods as used by GNNs. We show that, empirically, it correlates better with GNN accuracy in node classification tasks than homophily.

show abstract

Measuring and Relieving the Over-smoothing Problem for Graph Neural Networks from the Topological View

Cited by 17 publications

References 22 publications

AdaGNN

AdaGNN

Dirichlet Energy Constrained Learning for Deep Graph Neural Networks

On Local Aggregation in Heterophilic Graphs

Contact Info

Product

Resources

About