2014
DOI: 10.1145/2678373.2665702

Enabling preemptive multiprogramming on GPUs

Abstract: GPUs are being increasingly adopted as compute accelerators in many domains, spanning environments from mobile systems to cloud computing. These systems usually run multiple applications, from one or several users. However, GPUs do not provide the support for resource sharing traditionally expected in these scenarios. Thus, such systems are unable to meet key multiprogrammed-workload requirements such as responsiveness, fairness, or quality of service. In this paper, we propose a set of hardware exten…

Cited by 64 publications (39 citation statements)
References: 27 publications
“…Finally, GPUs were non-preemptive until recently, which means a long running GPU kernel cannot be preempted by software until it finishes. This will cause unfairness between multiple kernels and severely deteriorate the responsiveness of latency-critical kernels [Tanasic et al 2014]. Currently, a new GPU architecture to support GPU kernel preemption has emerged in the market [NVIDIA 2016a], but it is expected that existing GPUs will continue to suffer from this issue.…”
Section: Scheduling Methods
confidence: 99%
“…To overcome this limitation, GPU architectures that support hardwarebased preemption were suggested, and finally preemptive GPUs have emerged in the market recently [4]. However, context switching in such GPUs is very expensive and recent research [5] reports that hardware-based preemption decreases the total throughput up to 35% in a wide range of GPU applications. Owing to this reason, we expect that certain HPC clouds will still employ non-preemptive GPUs for throughput-sensitive applications.…”
Section: Non-preemptive Scheduling
confidence: 99%
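The preceding statement motivates the software-level alternative to expensive hardware context switches: cooperative preemption, where the kernel itself polls a flag and yields at safe points, so that only coarse progress state must be saved. The CUDA sketch below is a minimal illustration of that idea under our own assumptions; it is not the mechanism of the cited papers, and the names (preemptible_kernel, preempt_flag, next_tile) are hypothetical.

```cuda
#include <cuda_runtime.h>

// Cooperative preemption sketch: blocks claim tiles from a global work
// counter and poll a host-visible flag between tiles. When the host sets
// the flag (e.g., through mapped or managed memory), every block exits at
// the next tile boundary; relaunching the kernel later resumes from the
// first unclaimed tile. Assumes blockDim.x == tile_size.
__global__ void preemptible_kernel(float *data, int n_tiles, int tile_size,
                                   int *next_tile, volatile int *preempt_flag)
{
    __shared__ int tile;  // tile claimed by this block in this iteration
    __shared__ int stop;  // block-uniform snapshot of the preemption flag
    for (;;) {
        if (threadIdx.x == 0) {
            stop = *preempt_flag;                      // one coherent read per block
            tile = stop ? 0 : atomicAdd(next_tile, 1); // claim work only if running
        }
        __syncthreads();                      // publish 'stop' and 'tile' to the block
        if (stop || tile >= n_tiles) return;  // uniform exit: no divergent barrier
        data[tile * tile_size + threadIdx.x] *= 2.0f;  // stand-in for real work
        __syncthreads();                      // keep 'tile' stable until all threads finish
    }
}
```

Because a block only claims a tile after checking the flag, every claimed tile is fully processed before the block yields, so the saved state reduces to the single work counter; the cost is polling overhead and a preemption latency bounded by one tile of work.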
“…However, as a GPU is composed of massive computation cores each of which has its own context, the amount of data to be saved and restored at a single time reaches to several hundreds of KB. Unfortunately, this causes significant throughput degradation up to 35% in GPU applications [5]. Therefore, we believe that non-preemptive GPUs are still relevant for performance-sensitive applications, and addressing non-preemptive scheduling remains an important issue.…”
Section: Introduction
confidence: 99%
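The "several hundreds of KB" figure is easy to sanity-check with a back-of-the-envelope calculation. Assuming a Kepler-class SM (our assumption; the statement does not name a specific part) with a 65,536-entry file of 32-bit registers and 48 KB of shared memory:

    65,536 registers × 4 B + 48 KB = 256 KB + 48 KB ≈ 304 KB of context per SM

Across the 15 SMs of such a GPU, a full context switch would move on the order of 4-5 MB through the memory hierarchy, which is consistent with the throughput degradation quoted above.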
“…Hardware support for preemption has been proposed for Nvidia GPUs, as well as SM draining, whereby workgroups occupying a streaming multiprocessor (SM; a compute unit using our terminology) are allowed to complete until the SM becomes free for other tasks [28]. SM draining is limited by the presence of blocking constructs, since it may not be possible to drain a blocked workgroup.…”
Section: Related Work
confidence: 99%
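To make the draining limitation concrete, the toy CUDA kernel below (our construction, not taken from [28]) contains an inter-workgroup dependency: one block spins on a flag that only another block sets. If the spinning block is resident on an SM being drained while the producer block has not yet been scheduled, the spinning block can never run to completion, so the drain never terminates.

```cuda
#include <cuda_runtime.h>

// A blocking construct that defeats SM draining: the consumer block
// busy-waits on a flag written by the producer block, so it cannot
// simply "run to completion" on its own. Illustrative sketch only;
// 'flag' must point to zeroed global memory.
__global__ void blocking_pair(volatile int *flag)
{
    if (blockIdx.x == 0) {
        if (threadIdx.x == 0)
            *flag = 1;               // producer: publish the flag
    } else {
        while (*flag == 0)           // consumer: inter-workgroup busy-wait
            ;                        // blocked until block 0 has run
    }
}
```

Draining a workgroup stuck in such a spin requires either preempting it anyway or guaranteeing that its producer is eventually scheduled, which is precisely the gap that the hardware preemption proposals aim to close.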