2021
DOI: 10.2172/1817326

A Taxonomy for Classification and Comparison of Dataflows for GNN Accelerators

Abstract: Recently, Graph Neural Networks (GNNs) have received a lot of interest because of their success in learning representations from graph structured data. However, GNNs exhibit different compute and memory characteristics compared to traditional Deep Neural Networks (DNNs). Graph convolutions require feature aggregations from neighboring nodes (known as the aggregation phase), which leads to highly irregular data accesses. GNNs also have a very regular compute phase that can be broken down into matrix multiplication…
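The abstract's split into an irregular aggregation phase and a regular, matmul-friendly combination phase can be made concrete with a minimal sketch. This is my own illustration in NumPy/SciPy, not code from the report; the graph size, density, and variable names are arbitrary assumptions.

```python
import numpy as np
from scipy.sparse import random as sparse_random

num_nodes, in_feats, out_feats = 8, 4, 3

# Sparse adjacency matrix A: its irregular non-zero pattern is what makes
# the aggregation phase's memory accesses irregular.
A = sparse_random(num_nodes, num_nodes, density=0.2, format="csr")

X = np.random.rand(num_nodes, in_feats)   # node feature matrix
W = np.random.rand(in_feats, out_feats)   # layer weight matrix

# Aggregation phase: sparse x dense product (SpMM), irregular accesses.
aggregated = A @ X

# Combination phase: regular dense matrix multiplication (GEMM).
out = aggregated @ W
print(out.shape)  # (8, 3)
```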

Cited by 9 publications (7 citation statements) · References 36 publications
“…In the (b) representation, the neighbors of a particular vertex are stored back-to-back because the adjacency matrices are highly sparse. Representation (b) is often used as the graph representation [5,6]. In this study, we used (b) to describe the adjacency matrix comprising an edge array Fig.…”
Section: GNNs
confidence: 99%
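As a concrete illustration of the back-to-back neighbor layout this statement describes, here is a minimal CSR-style sketch. The edge list and all names are hypothetical; this is my own illustration, not the cited paper's code.

```python
import numpy as np

num_nodes = 5
edges = [(0, 1), (0, 3), (1, 2), (2, 4), (3, 4)]  # (src, dst) pairs

# CSR-style layout: edge_array holds all neighbor IDs contiguously;
# offsets[v] .. offsets[v+1] delimits vertex v's neighbor slice.
offsets = np.zeros(num_nodes + 1, dtype=int)
for src, _ in edges:
    offsets[src + 1] += 1
offsets = np.cumsum(offsets)

edge_array = np.empty(len(edges), dtype=int)
cursor = offsets[:-1].copy()
for src, dst in edges:
    edge_array[cursor[src]] = dst
    cursor[src] += 1

# Neighbors of vertex 0 are stored back-to-back:
print(edge_array[offsets[0]:offsets[1]])  # [1 3]
```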
“…For instance, HyGCN [12] explored both intra- and inter-vertex parallelism to separately handle the irregularity in the aggregation phase and the reusability in the combination phase. Later, aiming to boost overall hardware utilization, AWB-GCN [5] proposed to balance the workload at runtime with an auto-tuning algorithm and to increase data locality by regionally clustering the non-zero values (i.e., connected neighbors) within the adjacency matrices; EnGN [45] proposed a ring-edge-reduce dataflow to handle graphs with arbitrary dimensions and increase the accelerator's scalability to large graphs; and GRIP [46] employed fine-grained vertex tiling to reduce the weight bandwidth requirements. In parallel, to reduce the human effort in designing GNN accelerators and to democratize the process, pioneering works have attempted to characterize the design space of dataflows and micro-architectures for GNN accelerators [13] and have developed an automated framework to generate GNN accelerators [14]. Nevertheless, existing automated frameworks for GNNs still have limited support for various GNN structures and thus suffer from low hardware utilization and achievable efficiency on certain tasks.…”
Section: Related Work
confidence: 99%
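The workload imbalance that AWB-GCN's auto-tuning balancing targets can be illustrated numerically: aggregation work per vertex scales with its degree, which is heavily skewed in real graphs. The sketch below is an assumption-laden illustration (Zipf-distributed degrees, round-robin assignment to 16 PEs), not AWB-GCN's actual scheme.

```python
import numpy as np

rng = np.random.default_rng(0)
num_nodes = 1000

# Illustrative assumption: heavy-tailed (power-law-like) vertex degrees.
degrees = rng.zipf(2.0, size=num_nodes)

# Statically assign vertices round-robin to 16 processing elements;
# each vertex's aggregation cost is proportional to its degree.
num_pes = 16
pe_work = np.zeros(num_pes)
for v, d in enumerate(degrees):
    pe_work[v % num_pes] += d

# Imbalance = max PE load / mean PE load; values well above 1 mean PEs
# sit idle at runtime, which auto-tuned rebalancing tries to drive to 1.
print("imbalance:", pe_work.max() / pe_work.mean())
```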
“…For example, HyGCN [12] proposes hybrid execution patterns for GNNs, leveraging their intra-vertex and inter-vertex parallelism to handle the irregularity in the aggregation phase and the reusability in the combination phase, respectively. Later, AWB-GCN [5] identifies the workload imbalance problem in the aggregation phase and proposes auto-tuning workload-balancing techniques, achieving an average speedup of 7.4× over HyGCN. On the development-tool level, pioneering works have attempted to characterize the design space of dataflows and micro-architectures for GNN accelerators [13] and to develop an automated framework to generate GNN accelerators [14].…”
Section: Introduction
confidence: 99%
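To make the two parallelism axes named above concrete, here is a sketch of a plain neighbor-sum aggregation over the CSR arrays from the earlier sketch. The loop structure is my own illustration, not HyGCN's actual microarchitecture.

```python
import numpy as np

def aggregate(offsets, edge_array, X):
    """Sum neighbor features for every vertex (plain sum aggregation).

    Inter-vertex parallelism: iterations of the outer loop over `v` are
    independent, so vertices can be spread across processing elements.
    Intra-vertex parallelism: the feature-wise addition in the inner loop
    is element-wise over X's feature dimension and can be vectorized.
    """
    out = np.zeros_like(X)
    for v in range(len(offsets) - 1):              # inter-vertex axis
        for e in range(offsets[v], offsets[v + 1]):
            out[v] += X[edge_array[e]]             # intra-vertex axis
    return out
```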
“…Later, AWB-GCN [13] identifies the workload imbalance problem in the aggregation phase, where the non-zero values (i.e., connected neighbors) in adjacency matrices are regionally clustered, and proposes auto-tuning workload-balancing techniques to alleviate the runtime imbalance. Another trend is to summarize the design space of dataflow and microarchitecture optimizations in GCN accelerators [12] and to provide automated frameworks that generate suitable hardware for given GCN applications [24], [50]. For example, G-CoS [50] develops the first co-search framework that can automatically search for matched GNN structures and accelerators to maximize both task accuracy and acceleration efficiency.…”
Section: Related Work
confidence: 99%
“…Why GCN Inference Is Inefficient. There exists a fundamental dilemma associated with GCN inference acceleration: to accelerate GCN inference, the irregularity of GCNs' adjacency matrices needs to be reduced, which can inevitably degrade the inference accuracy; on the other hand, maintaining GCNs' irregularity and thus their excellent accuracy can lead to extremely high hardware costs for GCN inference, as demonstrated in recent works [42], [13], [25], [24], [20], [7], [12], [47]; both of these limit their more extensive applications.…”
Section: GCoD: Motivation and Overview
confidence: 99%