While many of the architectural details of future exascale-class high performance computer systems are still a matter of intense research, there appears to be a general consensus that they will be strongly heterogeneous, featuring "standard" as well as "accelerated" resources. Today, such resources are available as multicore processors, graphics processing units (GPUs), and other accelerators such as the Intel Xeon Phi. Any software infrastructure that claims usefulness for such environments must be able to meet their inherent challenges: massive multi-level parallelism, topology, asynchronicity, and abstraction. The "General, Hybrid, and Optimized Sparse Toolkit" (GHOST) is a collection of building blocks that targets algorithms dealing with sparse matrix representations on current and future large-scale systems. It implements the "MPI+X" paradigm, has a pure C interface, and provides hybrid-parallel numerical kernels, intelligent resource management, and truly heterogeneous parallelism for multicore CPUs, Nvidia GPUs, and the Intel Xeon Phi. We describe the details of its design with respect to the challenges posed by modern heterogeneous supercomputers.
Block variants of the Jacobi-Davidson method for computing a few eigenpairs of a large sparse matrix are known to improve the robustness of the standard algorithm, but they are generally shunned because the total number of floating-point operations increases. In this paper we present the implementation of a block Jacobi-Davidson solver. By detailed performance engineering and numerical experiments we demonstrate that the increase in operations is typically more than compensated by performance gains on modern architectures, giving a method that is both more efficient and more robust than its single-vector counterpart.
We first briefly report on the status and recent achievements of the ELPA-AEO (Eigenvalue Solvers for Petaflop Applications - Algorithmic Extensions and Optimizations) and ESSEX-II (Equipping Sparse Solvers for Exascale) projects. In both collaborative efforts, scientists from the application areas, mathematicians, and computer scientists work together to develop and make available efficient, highly parallel methods for the solution of eigenvalue problems. Then we focus on a topic addressed in both projects, the use of mixed-precision computations to enhance efficiency. We give a more detailed description of our approaches for benefiting from either lower or higher precision in three selected contexts and of the results thus obtained.

Keywords: ELPA-AEO · ESSEX · eigensolver · parallel · mixed precision

Introduction

Eigenvalue computations are at the core of simulations in various application areas, including quantum physics and electronic structure computations. Being able to best utilize the capabilities of current and emerging high-end computing systems is essential for further improving such simulations, with respect to space/time resolution or by including additional effects in the models. Given these needs, the ELPA-AEO and ESSEX-II projects contribute to the development and implementation of efficient, highly parallel methods for eigenvalue problems, in different contexts. Both projects aim at adding new features (concerning, e.g., performance and resilience) to previously developed methods and at providing additional functionality with new methods. Building on the results of the first ESSEX funding phase [14,34], ESSEX-II again focuses on iterative methods for very large eigenproblems arising, e.g., in quantum physics. ELPA-AEO's main application area is electronic structure computation, and for these moderately sized eigenproblems direct methods are often superior.
Such methods are available in the widely used ELPA library [19], which had originated in an earlier project [2] and is being further improved and extended within ELPA-AEO. In Sections 2 and 3 we briefly report on the current state and on recent achievements in the two projects, with a focus on aspects that may be of particular interest to prospective users of the software or the underlying methods. In Section 4 we turn to computations involving different precisions. Looking at three examples from the two projects, we describe how lower or higher precision is used to reduce the computing time.

The ELPA-AEO project

In the ELPA-AEO project, chemists, mathematicians, and computer scientists from the Max Planck Computing and Data Facility in Garching, the Fritz Haber Institute of the Max Planck Society in Berlin, the Technical University of Munich, and the University of Wuppertal collaborate to provide highly scalable methods for solving moderately sized (n ≲ 10^6) Hermitian eigenvalue problems. Such problems arise, e.g., in electronic structure computations, and during the earlier ELPA project, efficient...
As we approach the Exascale computing era, disruptive changes in the software landscape are required to tackle the challenges posed by manycore CPUs and accelerators. We discuss the development of a new 'Exascale enabled' sparse solver repository (the ESSR) that addresses these challenges, from fundamental design considerations and development processes to actual implementations of some prototypical iterative schemes for computing eigenvalues of sparse matrices. Key features of the ESSR include holistic performance engineering, tight integration between software layers, and mechanisms to mitigate hardware failures.