Metacomputing systems are intended to support remote and/or concurrent use of geographically distributed computational resources. Resource management in such systems is complicated by five concerns that do not typically arise in other situations: site autonomy and heterogeneous substrates at the resources, and application requirements for policy extensibility, co-allocation, and online control. We describe a resource management architecture that addresses these concerns. This architecture distributes the resource management problem among distinct local manager, resource broker, and resource co-allocator components and defines an extensible resource specification language to exchange information about requirements. We describe how these techniques have been implemented in the context of the Globus metacomputing toolkit and used to implement a variety of different resource management strategies. We report on our experiences applying our techniques in a large testbed, GUSTO, incorporating 15 sites, 330 computers, and 3600 processors.
Some computational grid applications have very large resource requirements and need simultaneous access to resources from more than one parallel computer. Current scheduling systems do not provide mechanisms to gain such simultaneous access without the help of the human administrators of the computer systems. In this work, we propose and evaluate several algorithms for supporting advanced reservation of resources in supercomputing scheduling systems. These advanced reservations allow users to request resources from scheduling systems at specific times. We find that the wait times of applications submitted to the queue increase when reservations are supported, and that the size of the increase depends on how reservations are supported. Further, we find that the best performance is achieved when we assume that applications can be terminated and restarted, backfilling is performed, and relatively accurate run-time predictions are used.
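To make the idea of requesting resources at a specific time concrete, the following is a minimal sketch of an admission check for a new reservation. It is not one of the algorithms evaluated in the paper: the function name `can_reserve`, the `(start, end, nodes)` tuple layout, and the half-open interval convention are all assumptions made for illustration.

```python
def can_reserve(total_nodes, reservations, start, end, nodes):
    """Return True if `nodes` nodes are free for the whole window [start, end).

    `reservations` is a list of (start, end, nodes) tuples describing
    reservations already granted (hypothetical layout). Intervals are
    half-open: a reservation ending at t does not conflict with one
    starting at t.
    """
    # Node usage only rises at reservation start times, so the peak usage
    # inside the window occurs either at `start` or at the start of some
    # existing reservation that begins inside the window.
    check_points = [start] + [s for s, _, _ in reservations if start < s < end]
    for t in check_points:
        in_use = sum(n for s, e, n in reservations if s <= t < e)
        if in_use + nodes > total_nodes:
            return False
    return True


existing = [(0, 10, 4), (5, 15, 2)]
print(can_reserve(8, existing, 0, 5, 4))   # fits next to the first reservation
print(can_reserve(8, existing, 4, 8, 4))   # collides once the second one starts
```

A real scheduler would also have to decide what happens to queued and running jobs that overlap a granted reservation, which is where the choices mentioned in the abstract (termination and restart, backfilling, run-time prediction accuracy) come in.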
On many computers, a request to run a job is not serviced immediately but instead is placed in a queue and serviced only when resources are released by preceding jobs. In this paper, we build on run-time prediction techniques that we developed in previous research to explore two problems. The first problem is to predict how long applications will wait in a queue until they receive resources. We show that run-time estimates can be used for this and that using our run-time estimates results in more accurate wait-time predictions than using the run-time prediction techniques of other researchers. The second problem we investigate is improving scheduling performance. We use run-time predictions to improve the performance of the least-work-first and backfill scheduling algorithms. We find that using our run-time predictor results in lower mean wait times for the workloads with higher offered loads when compared to alternative run-time predictors.
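One way to turn run-time estimates into wait-time predictions is to simulate the scheduler forward in time using the estimates in place of the unknown true run times. The sketch below does this for a plain first-come-first-served queue, which is a simplifying assumption: the paper studies least-work-first and backfill scheduling, and the function name `predict_wait_fcfs` and the input layout are hypothetical.

```python
import heapq


def predict_wait_fcfs(total_nodes, running, queued):
    """Predict wait times for queued jobs under first-come-first-served.

    running: list of (nodes, predicted_remaining_time) for jobs now running.
    queued:  list of (nodes, predicted_run_time) in queue order.
    Returns the predicted wait (time from now until start) of each queued job.
    """
    free = total_nodes - sum(n for n, _ in running)
    # Min-heap of (predicted_finish_time, nodes_released).
    events = [(remaining, n) for n, remaining in running]
    heapq.heapify(events)

    now = 0
    waits = []
    for nodes, run_time in queued:
        if nodes > total_nodes:
            raise ValueError("job requests more nodes than the machine has")
        # Advance simulated time until enough nodes have been released.
        while free < nodes:
            finish, released = heapq.heappop(events)
            now = max(now, finish)
            free += released
        waits.append(now)
        free -= nodes
        heapq.heappush(events, (now + run_time, nodes))
    return waits


# 4-node machine, two running jobs, three queued jobs.
print(predict_wait_fcfs(4, [(2, 10), (2, 5)], [(2, 20), (4, 1), (1, 3)]))
# prints [5, 25, 26]
```

The accuracy of the predicted waits depends directly on the accuracy of the run-time estimates fed in, which is the relationship the abstract reports on.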