Proceedings of the 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming 2017
DOI: 10.1145/3018743.3018754
Pagoda

Abstract: Massively multithreaded GPUs achieve high throughput by running thousands of threads in parallel. To fully utilize the hardware, workloads spawn work to the GPU in bulk by launching large tasks, where each task is a kernel that contains thousands of threads that occupy the entire GPU. GPUs face severe underutilization and their performance benefits vanish if the tasks are narrow, i.e., they contain < 500 threads. Latency-sensitive applications in network, signal, and image processing that generate a large numbe…
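As a rough illustration of the "narrow task" problem the abstract describes (this is a sketch, not code from the paper), the CUDA program below contrasts a narrow launch of a few hundred threads with a bulk launch large enough to occupy the whole GPU; the kernel and buffer names are hypothetical.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Trivial per-thread work: scale one element of a vector.
__global__ void scale(float *data, int n, float factor) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= factor;
}

int main() {
    const int narrow_n = 256;    // a "narrow" task: only 256 threads
    const int wide_n = 1 << 20;  // a bulk task: ~1M threads, enough to fill the GPU

    float *narrow_buf, *wide_buf;
    cudaMalloc(&narrow_buf, narrow_n * sizeof(float));
    cudaMalloc(&wide_buf, wide_n * sizeof(float));

    // Narrow launch: a single block of 256 threads leaves almost every SM idle,
    // yet it still pays the full kernel-launch overhead.
    scale<<<1, 256>>>(narrow_buf, narrow_n, 2.0f);

    // Bulk launch: thousands of blocks keep the whole GPU busy.
    scale<<<(wide_n + 255) / 256, 256>>>(wide_buf, wide_n, 2.0f);

    cudaDeviceSynchronize();
    cudaFree(narrow_buf);
    cudaFree(wide_buf);
    return 0;
}
```

When many such narrow launches arrive back to back, as in the latency-sensitive network, signal, and image-processing workloads the abstract mentions, per-launch overhead and idle execution units dominate, which is the underutilization the paper targets.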

Cited by 22 publications
References 35 publications