2016
DOI: 10.1109/tc.2015.2506582

GPUvm: GPU Virtualization at the Hypervisor

Cited by 50 publications (54 citation statements)
References 27 publications
“…GPUvm [Suzuki et al. 2014] implements both full and para-virtualization in the Xen hypervisor, using the Nouveau driver [X.Org Foundation 2011] on the guest OS side. To isolate multiple VMs on a GPU under full virtualization, GPUvm partitions both the physical GPU memory and the MMIO region into several pieces and assigns each portion to an individual VM.…”
Section: Full Virtualization (mentioning)
confidence: 99%
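The partitioning scheme quoted above is straightforward to sketch. The minimal C example below (the names, region sizes, and equal-split policy are illustrative assumptions, not taken from the GPUvm source) shows how a hypervisor might statically divide GPU physical memory and the MMIO BAR into per-VM slices and bounds-check a guest access against its own slice:

```c
/* Minimal sketch of static GPU memory/MMIO partitioning across VMs,
 * in the spirit of GPUvm's full virtualization. All names and sizes
 * are hypothetical, not taken from the GPUvm source. */
#include <stdint.h>
#include <stdio.h>

#define NUM_VMS      4
#define GPU_MEM_SIZE (4ULL << 30)  /* 4 GiB of GPU physical memory */
#define MMIO_SIZE    (16ULL << 20) /* 16 MiB MMIO (BAR) region     */

struct vm_gpu_slice {
    uint64_t mem_base, mem_size;   /* per-VM GPU memory partition  */
    uint64_t mmio_base, mmio_size; /* per-VM slice of the MMIO BAR */
};

/* Assign VM i the i-th equal-sized slice of memory and MMIO. */
static void partition_gpu(struct vm_gpu_slice slices[NUM_VMS])
{
    uint64_t mem_part  = GPU_MEM_SIZE / NUM_VMS;
    uint64_t mmio_part = MMIO_SIZE / NUM_VMS;
    for (int i = 0; i < NUM_VMS; i++) {
        slices[i].mem_base  = (uint64_t)i * mem_part;
        slices[i].mem_size  = mem_part;
        slices[i].mmio_base = (uint64_t)i * mmio_part;
        slices[i].mmio_size = mmio_part;
    }
}

/* Reject any MMIO access outside the VM's own slice; the hypervisor
 * would run this check on every trapped access. */
static int access_allowed(const struct vm_gpu_slice *s, uint64_t addr)
{
    return addr >= s->mmio_base && addr < s->mmio_base + s->mmio_size;
}

int main(void)
{
    struct vm_gpu_slice slices[NUM_VMS];
    partition_gpu(slices);
    for (int i = 0; i < NUM_VMS; i++)
        printf("VM%d: mem [%#llx, +%#llx)  mmio [%#llx, +%llx)  probe ok=%d\n",
               i,
               (unsigned long long)slices[i].mem_base,
               (unsigned long long)slices[i].mem_size,
               (unsigned long long)slices[i].mmio_base,
               (unsigned long long)slices[i].mmio_size,
               access_allowed(&slices[i], slices[i].mmio_base));
    return 0;
}
```

GPUvm additionally shadows GPU page tables so that guest mappings resolve into the VM's own memory partition; this sketch omits that mechanism.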
“…GPUvm [Suzuki et al. 2014] employs the BAND scheduler of Gdev [Kato et al. 2012] and fixes a flaw in Credit scheduling. The original BAND scheduler distributes credits to each VM on the assumption that the total utilization of all vGPUs can reach 100%.…”
Section: Algorithms for Scheduling a Single GPU (mentioning)
confidence: 99%
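As a rough illustration of the credit mechanism being discussed, the sketch below (the field names, replenishment rule, and charge amount are assumptions, not GPUvm code) hands out credits to vGPUs in proportion to their weights each period, assuming the vGPUs together can consume 100% of the GPU, and always dispatches the vGPU with the most remaining credit:

```c
/* Minimal sketch of credit-based vGPU scheduling in the style of
 * Gdev's BAND scheduler as described in the quote above. The
 * replenishment rule and constants are illustrative assumptions. */
#include <stdio.h>

#define NUM_VGPUS 3

struct vgpu {
    int weight; /* share of GPU time this vGPU is entitled to */
    int credit; /* remaining credit in the accounting period   */
};

/* Each period, distribute credits in proportion to weight, assuming
 * the vGPUs together can reach 100% GPU utilization. */
static void replenish(struct vgpu v[], int n, int period_credits)
{
    int total_weight = 0;
    for (int i = 0; i < n; i++) total_weight += v[i].weight;
    for (int i = 0; i < n; i++)
        v[i].credit += period_credits * v[i].weight / total_weight;
}

/* Dispatch the vGPU with the most remaining credit. */
static int pick_next(const struct vgpu v[], int n)
{
    int best = 0;
    for (int i = 1; i < n; i++)
        if (v[i].credit > v[best].credit) best = i;
    return best;
}

int main(void)
{
    struct vgpu v[NUM_VGPUS] = { {2, 0}, {1, 0}, {1, 0} };
    replenish(v, NUM_VGPUS, 100);
    for (int step = 0; step < 4; step++) {
        int next = pick_next(v, NUM_VGPUS);
        printf("run vGPU %d (credit %d)\n", next, v[next].credit);
        v[next].credit -= 20; /* charge for the time slice consumed */
    }
    return 0;
}
```

The quoted flaw concerns what happens when the 100%-utilization assumption does not hold; the excerpt is truncated before the fix is described, so the sketch shows only the baseline policy.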
“…First, small (e.g., sub-millisecond) but frequent GPU requests, which are common in a wide range of classical HPC applications and in emerging real-time analytics applications, can burden virtualization stacks with frequent context switches between user and hypervisor spaces. Previous research invokes system or hypervisor calls on every GPU request [6], [7], [8], [9], [10], [11], [12], [13], which incurs a significant per-request trapping cost for small GPU requests. Second, workloads with high CPU-GPU interactivity can cause synchronization bottlenecks between the CPU and GPU schedulers.…”
Section: Introduction (mentioning)
confidence: 99%
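The first point, per-request trapping cost, can be made concrete with a back-of-envelope model (the microsecond figures below are invented for illustration, not measurements from the cited papers): a fixed trap cost paid on every request dominates as requests shrink.

```c
/* Back-of-envelope model of per-request trapping overhead.
 * All cost figures are made-up assumptions for illustration. */
#include <stdio.h>

int main(void)
{
    double trap_us = 5.0; /* assumed cost of one guest->hypervisor trap */
    double sizes_us[] = { 50.0, 500.0, 5000.0 }; /* GPU request lengths */
    int n = sizeof(sizes_us) / sizeof(sizes_us[0]);

    for (int i = 0; i < n; i++) {
        double req = sizes_us[i];
        /* fraction of total time spent trapping rather than computing */
        double overhead = trap_us / (trap_us + req) * 100.0;
        printf("request %7.0f us: trap overhead %5.1f%%\n", req, overhead);
    }
    return 0;
}
```

Under these assumed numbers, a 50 µs request pays roughly 9% trap overhead while a 5 ms request pays about 0.1%, which is the asymmetry the quoted passage attributes to trap-per-request designs.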
“…However, co-scheduling by itself fails to achieve good fairness because of interference between the CPU and GPU schedulers. Finally, variable request sizes in mixed workloads, which are challenging to handle on non-preemptive GPUs, are either ignored [6], [10] or addressed with reverse-engineering methods that are not supported on many GPUs [8], [9], [12], [14]. None of the existing GPU virtualization solutions takes all these factors into account, and they thus fail to attain acceptable fairness combined with strong performance isolation.…”
Section: Introduction (mentioning)
confidence: 99%