This paper examines the problem of predicting job runtimes by exploiting the properties of parameter sweeps. A new parameter sweep prediction framework, GIPSy (Grid Information Prediction System), is introduced. Predictions are made based on prior runtime information and the parameters used to configure each job. The main objective is to provide a tool that combines the development, simulation, and application of prediction models within one framework. The different kinds of available sample selectors and models are discussed in detail. Results are presented for a quantum physics problem. A previously introduced scheduling technique and its implementation, PGS (Prediction based Grid Scheduling), are improved and presented in combination with GIPSy to obtain a real-world grid implementation that optimizes the distribution of parameter sweeps.
Modern data centers use virtualization as a means to increase utilization of increasingly powerful multi-core servers. Applications often require only a fraction of the resources provided by modern hardware. Multiple concurrent workloads are therefore required to achieve adequate utilization levels. Current virtualization solutions allow hardware to be partitioned into Virtual Machines with appropriate isolation on most levels. However, unmanaged consolidation of resource-intensive workloads can still lead to unexpected performance variance. Measures are required to avoid or reduce performance interference and provide predictable service levels for all applications. In this paper, we identify and reduce network-related interference effects using performance models based on the runtime characteristics of virtualized workloads. We increase the applicability of existing training data by adding network-related performance metrics and benchmarks. Using the extended set of training data, we predict performance degradation with existing modeling techniques as well as combinations thereof. Application clustering is used to identify several new network-related application types with clearly defined performance profiles. Finally, we validate the added value of the improved models by introducing new scheduling techniques and comparing them to previous efforts. We demonstrate how the inclusion of network-related parameters in performance models can significantly increase the performance of consolidated workloads.