21st IEEE Real-Time and Embedded Technology and Applications Symposium 2015
DOI: 10.1109/rtas.2015.7108420
GPES: a preemptive execution system for GPGPU computing

Abstract: Graphics processing units (GPUs) are being widely used as co-processors in many application domains to accelerate general-purpose workloads that are computationally intensive, known as GPGPU computing. Real-time multi-tasking support is a critical requirement for many emerging GPGPU computing domains. However, due to the asynchronous and non-preemptive nature of GPU processing, in multi-tasking environments, tasks with higher priority may be blocked by lower-priority tasks for a lengthy duration. This severely…

Cited by 53 publications (21 citation statements)
References 12 publications
“…This situation can cause unfairness between multiple kernels and significantly deteriorate system responsiveness. Existing GPU scheduling methods address this issue by either killing a long-running kernel [Menychtas et al. 2014] or providing a kernel split tool [Basaran and Kang 2012; Zhou et al. 2015; Margiolas and O'Boyle 2016]. The Pascal architecture allows GPU kernels to be interrupted at instruction-level granularity by saving and restoring each GPU context to and from the GPU's DRAM.…”
Section: Algorithms for Scheduling a Single GPU
confidence: 99%
“…The literature shows this technique to be feasible with low performance overhead and significant benefits in terms of responsiveness [36]. We have adopted the implementation details from GPES [37]. When a vGPU has a long-running kernel, FairGV divides the kernel into a set of sub-kernels so that each sub-kernel is executed by a specified number of thread blocks.…”
Section: Collaborative Scheduling
confidence: 99%
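As a rough sketch of the thread-block-level slicing idea quoted above (the kernel, parameter names, and slice size below are illustrative assumptions, not GPES's or FairGV's actual code), a monolithic CUDA kernel can be given a block offset so that each launch covers only a bounded number of thread blocks:

```cuda
#include <cuda_runtime.h>
#include <algorithm>

// Hypothetical sliced version of a simple kernel: each launch handles only
// the thread blocks [blockOffset, blockOffset + gridDim.x) of the original grid.
__global__ void vecAddSlice(const float *a, const float *b, float *c,
                            int n, int blockOffset)
{
    int logicalBlock = blockIdx.x + blockOffset;      // index in the unsliced grid
    int i = logicalBlock * blockDim.x + threadIdx.x;
    if (i < n)
        c[i] = a[i] + b[i];
}

// Launch the original grid as a sequence of sub-kernels of at most
// sliceBlocks thread blocks each; other work can be scheduled between launches.
void launchSliced(const float *a, const float *b, float *c, int n,
                  int threadsPerBlock, int sliceBlocks, cudaStream_t stream)
{
    int totalBlocks = (n + threadsPerBlock - 1) / threadsPerBlock;
    for (int offset = 0; offset < totalBlocks; offset += sliceBlocks) {
        int blocks = std::min(sliceBlocks, totalBlocks - offset);
        vecAddSlice<<<blocks, threadsPerBlock, 0, stream>>>(a, b, c, n, offset);
    }
}
```

Each sub-kernel launch returns control to the host-side scheduler, which is what allows higher-priority kernels to be interposed between slices.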
“…The scheme has been studied in the real-time community, represented by the recent proposals of GPES [7] and PKM [6]. A challenge in this scheme is selecting an appropriate slice size.…”
Section: Related Work
confidence: 99%
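To make the slice-size tradeoff concrete, here is a small back-of-the-envelope sketch (all numbers and names are assumptions for illustration, not measurements from GPES or PKM): a smaller slice bounds the blocking time of higher-priority work more tightly, but multiplies the per-launch overhead.

```cuda
#include <cstdio>

// Illustrative slice-size calculation (hypothetical numbers).
int main()
{
    const double perBlockTimeUs   = 40.0;   // assumed execution time per thread block
    const double launchOverheadUs = 10.0;   // assumed per-launch overhead
    const double targetBlockingUs = 500.0;  // desired bound on blocking of higher-priority work
    const int    totalBlocks      = 4096;   // grid size of the original kernel

    // Largest slice that still keeps one slice under the blocking target.
    int sliceBlocks = (int)(targetBlockingUs / perBlockTimeUs);
    if (sliceBlocks < 1) sliceBlocks = 1;

    int numSlices = (totalBlocks + sliceBlocks - 1) / sliceBlocks;
    printf("slice = %d blocks, %d slices, added launch overhead = %.1f us\n",
           sliceBlocks, numSlices, numSlices * launchOverheadUs);
    return 0;
}
```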
“…The second class of work concerns granularity. These approaches use kernel slicing [6,7] to break one GPU kernel into many smaller ones. The reduced granularity increases flexibility in kernel scheduling and may shorten the time a kernel has to wait before it can be launched.…”
Section: Introduction
confidence: 99%
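A minimal host-side sketch of why finer granularity helps scheduling (the helper names and kernels below are hypothetical stand-ins, assuming the slicing interface sketched earlier): between consecutive slices of a long low-priority kernel, the host can first launch any pending higher-priority kernel, so the latter waits for at most one slice rather than the whole grid.

```cuda
#include <cuda_runtime.h>
#include <algorithm>

// Dummy kernels standing in for real work (hypothetical).
__global__ void lowPrioSlice(int blockOffset) { /* low-priority work for one slice */ }
__global__ void highPrioKernel()              { /* latency-sensitive work */ }

// Stub: in a real system this would poll a request queue shared with other tasks.
static bool highPrioPending() { return false; }

// Interleave slices of a long low-priority kernel with high-priority launches,
// so high-priority work is blocked for at most one slice.
void runLowPrioSliced(int totalBlocks, int sliceBlocks, cudaStream_t lowStream)
{
    for (int offset = 0; offset < totalBlocks; offset += sliceBlocks) {
        if (highPrioPending())
            highPrioKernel<<<1, 256>>>();      // GPU is idle here, so this runs before the next slice

        int blocks = std::min(sliceBlocks, totalBlocks - offset);
        lowPrioSlice<<<blocks, 256, 0, lowStream>>>(offset);
        cudaStreamSynchronize(lowStream);      // re-evaluate only after this slice completes
    }
}
```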