Pilot-Jobs support effective distributed resource utilization and are arguably one of the most widely used distributed computing abstractions, as measured by the number and types of applications that use them, as well as the number of production distributed cyberinfrastructures that support them. In spite of broad uptake, there does not exist a well-defined, unifying conceptual model of Pilot-Jobs that can be used to define, compare and contrast different implementations. Often, Pilot-Job implementations are strongly coupled to the distributed cyberinfrastructure they were originally designed for. These factors present a barrier to extensibility and interoperability. This paper is an attempt to (i) provide a minimal but complete model (P*) of Pilot-Jobs, (ii) establish the generality of the P* Model by mapping various existing and well-known Pilot-Job frameworks such as Condor and DIANE to P*, (iii) derive an interoperable and extensible API for the P* Model (Pilot-API), (iv) validate the implementation of the Pilot-API by concurrently using multiple distinct Pilot-Job frameworks on distinct production distributed cyberinfrastructures, and (v) apply the P* Model to Pilot-Data.
Distributed Computing Infrastructure is characterized by interfaces that are heterogeneous, both syntactically and semantically. SAGA represents the most comprehensive community effort to date to address this heterogeneity by defining a simple, uniform access layer. In this paper, we describe the basic concepts underpinning its design and development. We also discuss RADICAL-SAGA, which is the most widely used implementation of SAGA.
Many science applications require scalable task-level parallelism and support for the flexible execution and coupling of ensembles of simulations. Most high-performance system software and middleware, however, are designed to support the execution and optimization of single tasks. Motivated by the missing capabilities of these computing systems and the increasing importance of task-level parallelism, we introduce the Ensemble toolkit, which has the following application development features: (i) abstractions that enable the expression of ensembles as primary entities, and (ii) support for ensemble-based execution patterns that capture the majority of application scenarios. Ensemble toolkit uses a scalable pilot-based runtime system that decouples workload execution and resource management details from the expression of the application, and enables the efficient and dynamic execution of ensembles on heterogeneous computing resources. We investigate three execution patterns and characterize the scalability and overhead of Ensemble toolkit for these patterns. We investigate scaling properties for up to O(1000) concurrent ensembles and O(1000) cores and find linear weak and strong scaling behaviour.
Pilot-Jobs (PJ) have become one of the most successful abstractions in distributed computing. In spite of extensive uptake, there does not exist a well-defined, unifying conceptual model of Pilot-Jobs that can be used to define, compare and contrast PJ implementations. This presents a barrier to extensibility and interoperability. This paper is an attempt to (i) provide a minimal but complete model (P*) of Pilot-Jobs, (ii) establish the generality of the P* Model by mapping various well-known Pilot-Job frameworks such as Condor and DIANE to P*, and (iii) demonstrate the interoperable and concurrent usage of distinct Pilot-Job frameworks on different production distributed cyberinfrastructures via the use of an extensible API for the P* Model (Pilot-API).