2017
DOI: 10.1145/3068281

GPU Virtualization and Scheduling Methods

Abstract: The integration of graphics processing units (GPUs) on high-end compute nodes has established a new accelerator-based heterogeneous computing model, which now permeates high performance computing. The same paradigm nevertheless has limited adoption in cloud computing or other large-scale distributed computing paradigms. Heterogeneous computing with GPUs can benefit the Cloud by reducing operational costs and improving resource and energy efficiency. However, such a paradigm shift would require effective method…

Cited by 85 publications (64 citation statements)
References 127 publications
“…In fog/edge computing, containers are widely used because they realize lightweight virtualization. However, efficient GPU resource management in containers has not been explored sufficiently, compared to research in virtual machines [100]. In fog/edge devices, GPUs can be used for data analytics and to assist deep learning algorithms.…”
Section: Discussion (citation type: mentioning, confidence: 99%)
“…However, the special case happens when t = δ_i because at this time, if task i has not completed, it is dropped. For the purposes of calculating PCT(i, j) using Equation 5, PCT(i − 1, j) is guaranteed to be complete by its deadline. Therefore, as Equation 5 shows, all the impulses after δ_i are aggregated into the impulse at t = δ_i.…”
Section: Calculating Task Completion Time in the Presence of Task Dropping (citation type: mentioning, confidence: 99%)
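The aggregation step described in this excerpt can be illustrated with a short sketch. The code below is not from the cited paper or the surveyed work; it is a minimal illustration, assuming a task's completion-time distribution is represented as a list of (time, probability) impulses and that a task is dropped at its deadline δ_i, so any probability mass later than the deadline is folded into a single impulse at t = δ_i.

```python
def aggregate_at_deadline(impulses, deadline):
    """Fold all probability mass at or after the deadline into one impulse
    at the deadline, reflecting that the task is dropped at that point.

    impulses: list of (time, probability) pairs describing a discrete
              completion-time distribution (hypothetical representation).
    deadline: the task's deadline, delta_i in the excerpt above.
    """
    merged = {}
    late_mass = 0.0
    for t, p in impulses:
        if t < deadline:
            merged[t] = merged.get(t, 0.0) + p
        else:
            # Impulses at or after the deadline are aggregated into t = deadline.
            late_mass += p
    if late_mass > 0.0:
        merged[deadline] = merged.get(deadline, 0.0) + late_mass
    return sorted(merged.items())


# Example: 30% of the mass lies after the deadline (t = 5) and is merged
# into the impulse at t = 5.
print(aggregate_at_deadline([(2, 0.4), (4, 0.3), (6, 0.2), (8, 0.1)], 5))
# -> [(2, 0.4), (4, 0.3), (5, 0.3)]  (up to floating-point rounding)
```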
“…A successful middleware to implement this approach is rCUDA [24], which enables the concurrent remote usage of CUDA-enabled devices in a transparent way. An extensive survey of GPU virtualization techniques and scheduling methods is provided in [25]. Although there exist several scheduling methods to schedule job tasks into GPUs, varying from priority-based to load-balancing-based approaches, they perform fine-grained scheduling, being implemented at hypervisor or OS level.…”
Section: Related Work (citation type: mentioning, confidence: 99%)
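The transparency this excerpt attributes to rCUDA comes from API remoting: GPU calls issued by the application are intercepted on the client and forwarded to a server that owns the physical device. The sketch below is not rCUDA's actual protocol or API; it is a toy illustration of that forwarding idea, with a made-up JSON request format, a hypothetical port number, and a plain Python loop standing in for the real kernel launch.

```python
import json
import socket
import threading
import time

HOST, PORT = "127.0.0.1", 9099  # hypothetical endpoint, not part of rCUDA


def gpu_server():
    """Server side: owns the (here simulated) GPU and executes forwarded calls."""
    with socket.create_server((HOST, PORT)) as srv:
        conn, _ = srv.accept()
        with conn:
            request = json.loads(conn.recv(65536).decode())
            # A real remoting layer would dispatch to the CUDA runtime here;
            # this toy just adds the two vectors on the CPU.
            result = [a + b for a, b in zip(request["x"], request["y"])]
            conn.sendall(json.dumps({"result": result}).encode())


def remote_vector_add(x, y):
    """Client-side stub: serializes the call and forwards it to the GPU
    server, so the application code never touches a local GPU."""
    with socket.create_connection((HOST, PORT)) as conn:
        conn.sendall(json.dumps({"op": "vector_add", "x": x, "y": y}).encode())
        return json.loads(conn.recv(65536).decode())["result"]


if __name__ == "__main__":
    threading.Thread(target=gpu_server, daemon=True).start()
    time.sleep(0.2)  # give the toy server a moment to start listening
    print(remote_vector_add([1, 2, 3], [10, 20, 30]))  # -> [11, 22, 33]
```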