Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD 2019)
DOI: 10.1145/3292500.3330925
Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

Abstract: Graph convolutional network (GCN) has been successfully applied to many graph-based applications; however, training a large-scale GCN remains challenging. Current SGD-based algorithms suffer from either a high computational cost that grows exponentially with the number of GCN layers, or a large space requirement for keeping the entire graph and the embedding of each node in memory. In this paper, we propose Cluster-GCN, a novel GCN algorithm that is suitable for SGD-based training by exploiting the graph clusterin…
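For readers who want to see the idea in practice, below is a minimal sketch of Cluster-GCN-style minibatch training using PyTorch Geometric's ClusterData/ClusterLoader utilities, which implement the paper's partition-based scheme. The dataset (Cora), model depth, and hyperparameters here are illustrative assumptions, not the paper's experimental configuration.

# Minimal sketch of Cluster-GCN-style training with PyTorch Geometric.
# Dataset and hyperparameters are illustrative assumptions.
import torch
import torch.nn.functional as F
from torch_geometric.datasets import Planetoid
from torch_geometric.loader import ClusterData, ClusterLoader
from torch_geometric.nn import GCNConv

dataset = Planetoid(root='data/Cora', name='Cora')
data = dataset[0]

# Partition the graph (METIS) so each minibatch is a dense subgraph,
# avoiding the exponential neighbor expansion of node-wise sampling.
cluster_data = ClusterData(data, num_parts=50)
loader = ClusterLoader(cluster_data, batch_size=5, shuffle=True)

class GCN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(dataset.num_features, 64)
        self.conv2 = GCNConv(64, dataset.num_classes)

    def forward(self, x, edge_index):
        x = F.relu(self.conv1(x, edge_index))
        return self.conv2(x, edge_index)

model = GCN()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)

model.train()
for epoch in range(10):
    for batch in loader:  # each batch is a merged block of clusters
        optimizer.zero_grad()
        out = model(batch.x, batch.edge_index)
        loss = F.cross_entropy(out[batch.train_mask], batch.y[batch.train_mask])
        loss.backward()
        optimizer.step()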

Cited by 705 publications (112 citation statements)
References 9 publications
“…Generally, all deep learning models are trained by stochastic gradient descent, which requires us to minibatch our graphs. To break down a large graph for memory purposes and fast, efficient training, we follow the algorithm in [9], which is implemented in PyTorch Geometric. For the inductive task, we break up our training graph into 4000 roughly equal subgraphs and use a batch size of 256 subgraphs.…”
Section: Methods · Citation type: mentioning
confidence: 99%
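The setup this statement describes maps directly onto PyTorch Geometric's ClusterData/ClusterLoader. A minimal sketch, assuming `data` stands in for the citing paper's training graph (which is not specified here):

# Sketch of the described partitioning: ~4000 roughly equal subgraphs,
# with 256 of them merged per minibatch. `data` is a placeholder for the
# citing paper's training graph.
from torch_geometric.loader import ClusterData, ClusterLoader

cluster_data = ClusterData(data, num_parts=4000)                    # METIS partition into 4000 parts
loader = ClusterLoader(cluster_data, batch_size=256, shuffle=True)  # 256 subgraphs per training step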
“…Moreover, they just sample nodes or edges from the whole graph to form the minibatches at each layer iteratively (Zeng et al., 2020), which causes the “neighbor explosion” problem and leads to a large computational complexity. To solve this problem, some heuristic-based methods that perform subgraph sampling as a preprocessing step have been proposed (Chiang et al., 2019; Zeng et al., 2019). But the sampling strategies of these methods introduce non-identical node sampling probabilities and bias.…”
Section: Spectral Methods Define the Convolution Operator Based On… · Citation type: mentioning
confidence: 99%
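To make the “neighbor explosion” concrete, a back-of-the-envelope sketch with illustrative numbers (not taken from any of the cited papers):

# With layer-wise neighbor sampling, the receptive field of one target
# node grows roughly as fanout ** num_layers; a cluster-based minibatch
# only ever touches the nodes inside its subgraph, regardless of depth.
fanout, num_layers = 10, 4
nodes_touched_by_sampling = fanout ** num_layers  # 10**4 = 10,000 nodes for one target node
cluster_size = 1_000                              # assumed subgraph size
nodes_touched_by_cluster_batch = cluster_size     # fixed, independent of depth
print(nodes_touched_by_sampling, nodes_touched_by_cluster_batch)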
“…But it has limitations in scaling to large graphs and is difficult to train in a minibatch setting (Zeng et al., 2020). To address the problem of scaling GCNs to large graphs, layer sampling methods (Chen et al., 2018a; Ying et al., 2018; Chen et al., 2018b; Gao et al., 2018; Huang et al., 2018) and subgraph sampling methods (Chiang et al., 2019; Zeng et al., 2019; Zeng et al., 2020) have been proposed. They are designed for efficient minibatch training of GCNs (Zeng et al., 2020).…”
Section: Introduction · Citation type: mentioning
confidence: 99%
“…In this experiment, λ1 is set to 0.004 and λ2 is set to 0.05 according to a grid search over λ1 ∈ {0, 0.002, 0.004, 0.006, 0.008, 0.01} and λ2 ∈ {0.01, 0.02, 0.03, 0.04, 0.05}. For all GNN models compared in this experiment, the updated news representation and user representation are then concatenated, and this concatenated vector is fed through 3 fully connected layers to get the output.…”

Recovered results table (micro-F1, %; the two score columns are consistent with PPI and Reddit as reported in the cited papers):

Method            PPI            Reddit
[8]               97.3 ± 0.2     -
FastGCN [24]      -              93.7
GeniePath [42]    98.5           -
Cluster-GCN [43]  99.36          96.60
GaAN [25]         98.71 ± 0.02   96.83 ± 0.03
GAIN (ours)       99.37 ± 0.01   97.03 ± 0.03

Section: Experimental Set-up · Citation type: mentioning
confidence: 99%
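As a hypothetical illustration of the prediction head this statement describes (concatenating the news and user representations and feeding the result through 3 fully connected layers), assuming PyTorch; the class name, dimensions, and activations are all assumptions, not the GAIN authors' code:

# Hypothetical sketch: concatenate news/user vectors, then 3 FC layers.
# Dimensions, activations, and the class name are assumptions.
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    def __init__(self, news_dim=128, user_dim=128, hidden_dim=64, num_classes=2):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(news_dim + user_dim, hidden_dim), nn.ReLU(),  # FC layer 1
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),           # FC layer 2
            nn.Linear(hidden_dim, num_classes),                     # FC layer 3 -> output
        )

    def forward(self, news_repr, user_repr):
        # λ1/λ2 in the statement are regularization weights chosen by grid
        # search during training; they do not appear in the forward pass.
        return self.mlp(torch.cat([news_repr, user_repr], dim=-1))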