Abstract. Project ExaStencils pursues a radically new approach to stencil-code engineering. Present-day stencil codes are implemented in general-purpose programming languages, such as Fortran, C, or Java, or derivatives thereof, combined with harnesses for parallelism, such as OpenMP, OpenCL, or MPI. ExaStencils favors a much more domain-specific approach, with languages at several layers of abstraction, the most abstract being the mathematical formulation and the most concrete the optimized target code. At every layer, the corresponding language expresses not only computational directives but also domain knowledge of the problem and platform, to be leveraged for optimization. This approach enables highly automated code generation at all layers and has been demonstrated successfully before in the U.S. projects FFTW and SPIRAL for certain linear transforms.

The Challenges of Exascale Computing

The performance of supercomputers is on the way from petascale to exascale. Software technology for high-performance computing has been struggling to keep up with the advances in computing power, from terascale in 1996 to petascale in 2009 and on to exascale, now only a factor of 30 away and predicted for the end of the present decade. So far, traditional host languages, such as Fortran and C, equipped with harnesses for parallelism, such as MPI and OpenMP, have carried most of the burden, and they are being developed further with some new abstractions, notably the partitioned global address space (PGAS) memory model [1] [10]. Yet, the sequential host languages remain general-purpose: Fortran or C or, if object orientation is desired, C++ or Java.

The step from petascale to exascale performance challenges present-day software technology much more than the advances from gigascale to terascale and from terascale to petascale have. The reason is that the explicit treatment of the massive parallelism inside one node of a high-performance cluster can no longer be avoided.
That is, the cluster nodes must be manycores with high numbers of cores. The reorientation of the computer market from single cores to multicores and manycores has been observed with concern [29]. In the high-performance market, the situation is somewhat alleviated by the fact that the additional cycles that large numbers of cores provide are actually in demand. But the question of how to exploit them with efficient and robust software remains.

While the potential for massive parallelism on and off the chip is the single most serious challenge to exascale software technology, other challenges take on a high priority and are frequently mentioned, such as power conservation, fault tolerance, and heterogeneity of the execution platform [2]. At best, one would strive for performance portability, i.e., the ability to switch the software with ease from one platform, when it is being decommissioned, to the next, while maintaining the highest performance.

ExaStencils Application Domain: Stencil Codes

Stencil codes have extremely high significance and value for a good-sized c...
A standard technique for numerically solving elliptic partial differential equations on structured grids is to discretize them and then apply an efficient geometric multi-grid solver. Unfortunately, finding the optimal choice of multi-grid components and parameter settings is challenging, and existing auto-tuning techniques fail to explain performance-optimal settings. To improve the state of the art, we explore whether recent work on optimizing configurations of product lines can be applied to the stencil-code domain. In particular, we extend the domain-independent tool SPL Conqueror in an empirical study to predict the performance-optimal configurations of three geometric multi-grid stencil codes: a program using HIPAcc, the evaluation prototype HSMGP, and a program using DUNE. For HIPAcc, we reach a prediction accuracy of 96%, on average, measuring only 21.4% of all configurations; we predict a configuration that is nearly optimal after measuring less than 0.3% of all configurations. For HSMGP, we predict performance with an accuracy of 97%, including the performance-optimal configuration, while measuring 3.2% of all configurations. For DUNE, we predict the performance of all configurations with an accuracy of 86% after measuring 3.3% of all configurations. The performance-optimal configuration is within the 0.5% of configurations predicted to perform best.
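To make the class of codes under discussion concrete, the following is a minimal, illustrative sketch (not taken from either paper) of the basic building block of a geometric multi-grid solver: a weighted-Jacobi smoother applying the classic 3-point stencil to a 1D Poisson problem on a structured grid. The grid size, relaxation weight, and sweep count are arbitrary illustrative choices, not settings from the study.

```python
import numpy as np

def jacobi_smooth(u, f, h, omega=2.0 / 3.0, sweeps=100):
    """Weighted-Jacobi sweeps for -u'' = f, discretized with the
    3-point stencil (-u[i-1] + 2*u[i] - u[i+1]) / h^2."""
    for _ in range(sweeps):
        u_new = u.copy()
        # Stencil update on interior points; boundaries stay fixed (Dirichlet).
        u_new[1:-1] = ((1.0 - omega) * u[1:-1]
                       + omega * 0.5 * (u[:-2] + u[2:] + h * h * f[1:-1]))
        u = u_new
    return u

def residual_norm(u, f, h):
    """Norm of the residual f - A u on the interior points."""
    r = f[1:-1] - (-u[:-2] + 2.0 * u[1:-1] - u[2:]) / (h * h)
    return np.linalg.norm(r)

n = 65                               # illustrative grid size
h = 1.0 / (n - 1)
x = np.linspace(0.0, 1.0, n)
f = np.pi ** 2 * np.sin(np.pi * x)   # right-hand side; exact solution sin(pi x)
u0 = np.zeros(n)                     # zero initial guess, zero boundary values
u = jacobi_smooth(u0, f, h)
print(residual_norm(u, f, h) < residual_norm(u0, f, h))  # prints True
```

In a full multi-grid cycle, such a smoother would be combined with restriction and prolongation operators across a grid hierarchy; the choice of smoother, cycle type, and related parameters spans exactly the kind of configuration space whose performance the study above predicts.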