2021
DOI: 10.1109/tpds.2021.3138856

Accelerating Large Sparse Neural Network Inference using GPU Task Graph Parallelism

Cited by 5 publications (1 citation statement)
References 11 publications

“…At a particular VLSI timing analysis example, Heteroflow can reduce a baseline runtime from 99 minutes to 13 minutes (7.7× speed-up) on a machine of 40 CPU cores and 4 GPUs. Future work will focus on distributing our scheduler based on [46] and incorporating a broader range of workloads, including machine learning [47], [48] and engineering simulation [49], [50], [51].…”
Section: Discussion (mentioning)
confidence: 99%