High performance computing (HPC) is experiencing a phase change driven by the challenges of programming and managing heterogeneous multicore system architectures and large-scale system configurations. It is estimated that by the end of the next decade Exaflops computing systems may emerge, requiring hundreds of millions of cores and demanding multi-billion-way parallelism at an energy efficiency of 50 Gflops/watt. At the same time, there are many scaling-challenged applications that, although taking many weeks to complete, cannot scale even to a thousand cores using conventional distributed programming models. This paper describes an experimental methodology, ParalleX, that addresses these challenges through a change in the fundamental model of parallel computation from that of communicating sequential processes (e.g., MPI) to an innovative synthesis of concepts involving message-driven work-queue execution in the context of a global address space. The focus of this work is a new runtime system required to test, validate, and evaluate the use of ParalleX concepts for extreme scalability. This paper describes the ParalleX model and the HPX runtime system and discusses how both strategies contribute to the goal of extreme computing through dynamic asynchronous execution. The paper presents early experimental results of tests using a proof-of-concept runtime-system implementation. These results are very promising and are guiding future work towards a full-scale parallel programming and runtime environment.

INTRODUCTION

An important class of parallel applications is emerging as scaling impaired. These are problems that require substantial execution time, sometimes exceeding a month, but which are unable to make effective use of more than a few hundred processors.
One such example is numerical relativity, used to model colliding neutron stars in order to simulate gamma ray bursts (GRB) and simultaneously identify the gravitational wave signature for detection with such massive instruments as LIGO (Laser Interferometer Gravitational-Wave Observatory). These codes exploit the efficiencies of Adaptive Mesh Refinement (AMR) algorithms to concentrate processing effort on the most active parts of the computation space at any one time. However, conventional parallel programming methods using MPI [1] and systems such as distributed memory MPPs and Linux clusters exhibit poor efficiency and constrained scalability, severely limiting scientific advancement. Many other applications exhibit similar properties. To achieve dramatic improvements for such problems and prepare them for exploitation of Petaflops systems comprising millions of cores, a new execution model and programming methodology is required [2]. This paper briefly presents such a model, ParalleX, and provides early results from an experimental implementation of the HPX runtime system that suggest the future promise of such a computing strategy.

It is recognized that technology trends have forced high performance system architectures into the new regime of heterogeneous ...
Core Grid technologies are rapidly maturing, but there remains a shortage of real Grid applications. One important reason is the lack of a simple and high-level application programming toolkit, bridging the gap between existing Grid middleware and application-level needs. The Grid Application Toolkit (GAT), as currently developed by the EC-funded project GridLab [1], provides this missing functionality. As seen from the application, the GAT provides a unified simple programming interface to the Grid infrastructure, tailored to the needs of Grid application programmers and users. A uniform programming interface will be needed for application developers to create a new generation of "Grid-aware" applications. The GAT implementation handles both the complexity and the variety of existing Grid middleware services via so-called adaptors. Complementing existing Grid middleware, GridLab also provides high-level services to implement the GAT functionality.

We present the GridLab software architecture, consisting of the GAT, environment-specific adaptors, and GridLab services. We elaborate the concepts underlying the GAT and outline the corresponding application programming interface. We present the functionality of GridLab's high-level services and demonstrate how a dynamic Grid application can easily benefit from the GAT. All GridLab software is open source and can be downloaded from the project Web site.
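The adaptor idea described in this abstract can be illustrated with a minimal C++ sketch. It is not the GAT API: the names `FileAdaptor`, `LocalAdaptor`, `GridFtpAdaptor`, `Toolkit`, and `make_default_toolkit` are hypothetical, and each adaptor only returns a string describing the action it would take instead of calling real middleware. The point is only to show how one uniform call can be dispatched to whichever environment-specific adaptor accepts the request.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical adaptor interface: one implementation per middleware service.
class FileAdaptor {
public:
    virtual ~FileAdaptor() = default;
    // Reports whether this adaptor understands the URL scheme of `url`.
    virtual bool can_handle(const std::string& url) const = 0;
    // Returns a description of the copy it would perform (no real I/O here).
    virtual std::string copy(const std::string& src,
                             const std::string& dst) const = 0;
};

// True if `text` begins with `prefix`.
static bool starts_with(const std::string& text, const std::string& prefix) {
    return text.rfind(prefix, 0) == 0;
}

// Illustrative adaptor for local files.
class LocalAdaptor : public FileAdaptor {
public:
    bool can_handle(const std::string& url) const override {
        return starts_with(url, "file://");
    }
    std::string copy(const std::string& src,
                     const std::string& dst) const override {
        return "local copy: " + src + " -> " + dst;
    }
};

// Illustrative adaptor standing in for a Grid file-transfer service.
class GridFtpAdaptor : public FileAdaptor {
public:
    bool can_handle(const std::string& url) const override {
        return starts_with(url, "gsiftp://");
    }
    std::string copy(const std::string& src,
                     const std::string& dst) const override {
        return "gridftp copy: " + src + " -> " + dst;
    }
};

// The uniform interface seen by the application: it hides which adaptor runs.
class Toolkit {
public:
    void register_adaptor(std::unique_ptr<FileAdaptor> adaptor) {
        adaptors_.push_back(std::move(adaptor));
    }
    // Dispatches to the first registered adaptor that accepts the source URL.
    std::string copy(const std::string& src, const std::string& dst) const {
        for (const auto& adaptor : adaptors_) {
            if (adaptor->can_handle(src)) {
                return adaptor->copy(src, dst);
            }
        }
        throw std::runtime_error("no adaptor can handle " + src);
    }

private:
    std::vector<std::unique_ptr<FileAdaptor>> adaptors_;
};

// Builds a toolkit with both illustrative adaptors registered.
Toolkit make_default_toolkit() {
    Toolkit toolkit;
    toolkit.register_adaptor(std::make_unique<LocalAdaptor>());
    toolkit.register_adaptor(std::make_unique<GridFtpAdaptor>());
    return toolkit;
}
```

In this sketch the application only ever calls `Toolkit::copy`. Supporting another middleware service would mean registering one more adaptor, with no change to application code.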
The new challenges presented by exascale system architectures have made it difficult to achieve the desired scalability using traditional distributed-memory runtimes. Asynchronous many-task (AMT) systems are based on a new paradigm that shows promise in addressing these challenges, providing application developers with a productive and performant approach to programming on next-generation systems. HPX is a C++ library for concurrency and parallelism developed by The STE||AR Group, an international group of collaborators working in the field of distributed and parallel programming (Heller, Diehl, Byerly, Biddiscombe, & Kaiser, 2017; Kaiser et al., n.d.; Tabbal, Anderson, Brodowicz, Kaiser, & Sterling, 2011). It is a runtime system, written using modern C++ techniques, that is linked as part of an application. HPX exposes extended services and functionalities supporting the implementation of parallel, concurrent, and distributed capabilities for applications in any domain; it has been used in scientific computing, gaming, finance, data mining, and other fields.
Grid technology has matured considerably over the past few years. Progress in both implementation and standardization is reaching a level of robustness that enables production quality deployments of grid services in the academic research community with heightened interest and early adoption in the industrial community. Despite this progress, grid applications are far from ubiquitous, and new applications require an enormous amount of programming effort just to see first light. A key impediment to accelerated deployment of grid applications is the scarcity of high-level application programming abstractions that bridge the gap between existing grid middleware and application-level needs. The Simple API for Grid Applications (SAGA [1]) is a GGF standardization effort that addresses this particular gap by providing a simple, stable, and uniform programming interface that integrates the most common grid programming abstractions. These most common abstractions were identified through the analysis of several existing and emerging Grid applications. In this article, we present the SAGA effort, describe its relationship to other Grid API efforts within the GGF community, and introduce the first draft of the API using some application programming examples.
The significant increase in complexity of Exascale platforms due to energy-constrained, billion-way parallelism, with major changes to processor and memory architecture, requires new energy-efficient and resilient programming techniques that are portable across multiple future generations of machines. We believe that guaranteeing adequate scalability, programmability, performance portability, resilience, and energy efficiency requires a fundamentally new approach, combined with a transition path for existing scientific applications, to fully explore the rewards of today's and tomorrow's systems. We present HPX, a parallel runtime system which extends the C++11/14 standard to facilitate distributed operations, enable fine-grained constraint-based parallelism, and support runtime-adaptive resource management. This provides a widely accepted API enabling programmability, composability, and performance portability of user applications. By employing a global address space, we seamlessly augment the standard to apply to a distributed case. We present HPX's architecture, design decisions, and results selected from a diverse set of application runs showing superior performance, scalability, and efficiency over conventional practice.
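The C++11/14 facilities that this abstract says HPX extends include `std::async` and `std::future`. The sketch below uses only the standard library, so it is not HPX code and shows neither distributed execution nor a global address space. HPX provides counterparts such as `hpx::async` and `hpx::future` in its own namespace, but how closely a port of this sketch would follow the standard form is an assumption here. The helper name `parallel_sum` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <numeric>
#include <vector>

// Sums `data` by splitting it into up to `num_tasks` chunks, launching one
// asynchronous task per chunk, and combining the partial sums through futures.
long long parallel_sum(const std::vector<int>& data, std::size_t num_tasks) {
    if (num_tasks == 0) {
        num_tasks = 1;  // avoid division by zero below
    }
    // Chunk size rounded up so that all elements are covered.
    const std::size_t chunk = (data.size() + num_tasks - 1) / num_tasks;

    std::vector<std::future<long long>> parts;
    for (std::size_t begin = 0; begin < data.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, data.size());
        // Each task owns a disjoint, read-only slice of the input.
        parts.push_back(std::async(std::launch::async, [&data, begin, end] {
            const auto first = data.begin() + static_cast<std::ptrdiff_t>(begin);
            const auto last = data.begin() + static_cast<std::ptrdiff_t>(end);
            return std::accumulate(first, last, 0LL);
        }));
    }

    // get() blocks only until the corresponding task has finished.
    long long total = 0;
    for (auto& part : parts) {
        total += part.get();
    }
    return total;
}
```

The caller expresses the work as independent tasks and synchronizes only where results are consumed. This is the programming style the abstract describes as carried over from the standard, not a demonstration of HPX's scheduling or performance.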