Summary
Concurrent execution of tasks on GPUs can reduce the computation time of a workload by overlapping data transfer and execution commands. However, it is difficult to implement an efficient runtime scheduler that minimizes the workload makespan, as many execution orderings must be evaluated. In this paper, we employ scheduling theory to build a model that takes into account device capabilities, workload characteristics, constraints, and objective functions. In our model, GPU task scheduling is reformulated as a flow shop scheduling problem, which allows us to apply and compare well-known heuristics already developed in the operations research field. In addition, we develop a new heuristic, specifically focused on executing GPU commands, that achieves better scheduling results than previous ones. It leverages a precise GPU command execution model for both computation and data transfers to make more advantageous scheduling decisions. A comprehensive evaluation, showing the suitability and robustness of this new approach, is conducted on three different NVIDIA architectures (Kepler, Maxwell, and Pascal). Results confirm that the proposed heuristic achieves the best results in more than 90% of the experiments. Furthermore, a comparison has been made with MPS (Multi-Process Service), the NVIDIA API that deals with the execution of concurrent tasks, which shows that our solution obtains speed-ups ranging from 1.15 to 1.20.