2021
DOI: 10.48550/arxiv.2112.15345
Preprint

Distributed Hybrid CPU and GPU training for Graph Neural Networks on Billion-Scale Graphs

Abstract: Graph neural networks (GNNs) have shown great success in learning from graph-structured data. They are widely used in various applications, such as recommendation, fraud detection, and search. In these domains, the graphs are typically large, containing hundreds of millions or billions of nodes. To tackle this challenge, we develop DistDGLv2, a system that extends DistDGL for training GNNs in a mini-batch fashion, using distributed hybrid CPU/GPU training to scale to large graphs. DistDGLv2 places graph data in…
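
The abstract describes the hybrid CPU/GPU pattern at a high level. As a rough illustration (not DistDGLv2's actual API), the sketch below shows the core idea in plain PyTorch: graph features stay in host CPU memory, mini-batch seeds are sampled on the CPU, and only the sampled batch is copied to the GPU for the forward/backward pass. All sizes and the `sample_minibatch` helper are hypothetical stand-ins.

```python
# Minimal sketch of hybrid CPU/GPU mini-batch training (illustrative only;
# this is not DistDGLv2's API). Graph data stays in CPU memory; in DistDGLv2
# it would additionally be partitioned across machines.
import torch
import torch.nn as nn

num_nodes, feat_dim, hidden_dim, num_classes = 100_000, 64, 128, 10

features = torch.randn(num_nodes, feat_dim)           # CPU-resident features
labels = torch.randint(0, num_classes, (num_nodes,))  # CPU-resident labels

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Linear(feat_dim, hidden_dim), nn.ReLU(),
                      nn.Linear(hidden_dim, num_classes)).to(device)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def sample_minibatch(batch_size):
    """Hypothetical stand-in for CPU-side neighborhood sampling:
    just picks random seed nodes on the CPU."""
    return torch.randint(0, num_nodes, (batch_size,))

for step in range(100):
    seeds = sample_minibatch(1024)                      # sample on CPU
    x = features[seeds].to(device, non_blocking=True)   # copy only the batch
    y = labels[seeds].to(device)
    loss = nn.functional.cross_entropy(model(x), y)     # compute on GPU
    opt.zero_grad()
    loss.backward()
    opt.step()
```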

Cited by 5 publications (10 citation statements)
References 15 publications (23 reference statements)

“…Representations of large graphs can exceed CPU memory and require either distribution to multiple machines [12,19,37,40,49,50] or storage of graph data on disk [22,29]. In disk-based training the node representations are divided into partitions that are swapped in and out of CPU memory.…”
Section: Scaling GNN Training To Compute h^(k)
Mentioning, confidence: 99%
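
The partition-swapping scheme this citation describes can be sketched as a small LRU cache over on-disk feature partitions. The partition size, cache capacity, and helper names below are assumptions for illustration; `load_partition` synthesizes data in place of an actual disk read, and real disk-based systems such as those cited use more sophisticated layouts.

```python
# Hedged sketch of disk-based training: node features are split into fixed
# partitions, and an LRU cache swaps partitions in and out of CPU memory.
from collections import OrderedDict
import numpy as np

PART_SIZE = 10_000   # nodes per partition (assumed)
FEAT_DIM = 64        # feature width (assumed)
MAX_CACHED = 4       # partitions held in CPU memory at once (assumed)

def load_partition(pid):
    """Stand-in for reading one partition's features from disk.
    A real system would np.load / mmap a file here; we synthesize data."""
    rng = np.random.default_rng(pid)
    return rng.standard_normal((PART_SIZE, FEAT_DIM), dtype=np.float32)

class PartitionCache:
    def __init__(self):
        self.cache = OrderedDict()  # pid -> ndarray, kept in LRU order

    def get(self, pid):
        if pid in self.cache:
            self.cache.move_to_end(pid)         # mark as recently used
        else:
            if len(self.cache) >= MAX_CACHED:
                self.cache.popitem(last=False)  # evict least recently used
            self.cache[pid] = load_partition(pid)
        return self.cache[pid]

    def fetch(self, node_ids):
        """Gather feature rows for arbitrary node ids across partitions."""
        out = np.empty((len(node_ids), FEAT_DIM), dtype=np.float32)
        for i, nid in enumerate(node_ids):
            part = self.get(nid // PART_SIZE)
            out[i] = part[nid % PART_SIZE]
        return out

cache = PartitionCache()
batch = cache.fetch([5, 10_001, 12])  # pulls partitions 0 and 1 into memory
```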
“…Large-Scale Training: To scale GNN training to graphs which exceed the CPU memory capacity of a single box, many works opt for a distributed multi-machine approach [12,19,49]. In particular, recent work introduces DistDGLv2 as a distributed version of DGL [49].…”
Section: Related Work
Mentioning, confidence: 99%