This paper introduces two tools for manual energy evaluation and runtime tuning developed at IT4Innovations in the READEX project. The MERIC library can be used for manual instrumentation and analysis of any application in terms of energy consumption and runtime. Besides tracing, MERIC can also change environment and hardware parameters during the application runtime, which leads to energy savings. MERIC stores large amounts of data, which are difficult for a human to read. The RADAR generator analyses the MERIC output files to find the best settings of the evaluated parameters for each instrumented region. It generates a LaTeX report and a MERIC configuration file for application production runs.
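The per-region selection described above can be illustrated with a minimal sketch. It assumes a single tuned parameter (core frequency) and a single objective (energy); the region names, the numbers, and the `best_settings` helper are hypothetical and do not reflect the actual MERIC output format or the RADAR implementation.

```python
# Hypothetical measurements: (region, core frequency in GHz, energy in joules).
# These names and numbers are illustrative only; they are not MERIC output.
measurements = [
    ("assemble", 1.2, 410.0),
    ("assemble", 2.0, 350.0),
    ("assemble", 2.5, 395.0),
    ("solve", 1.2, 620.0),
    ("solve", 2.0, 700.0),
    ("solve", 2.5, 810.0),
]


def best_settings(records):
    """Return, for each region, the (frequency, energy) pair with the lowest energy."""
    best = {}
    for region, frequency, energy in records:
        # Keep the record only if it is the first or the cheapest seen so far.
        if region not in best or energy < best[region][1]:
            best[region] = (frequency, energy)
    return best


config = best_settings(measurements)
for region, (frequency, energy) in sorted(config.items()):
    print(f"{region}: {frequency} GHz ({energy} J)")
```

In this toy data set, each region ends up with a different preferred frequency, which is the situation in which per-region tuning pays off compared with one static setting for the whole run.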
Summary
Profiling and tuning of parallel applications is an essential part of HPC. Analysis and elimination of application hot spots can be performed using many available tools, which also provide resource consumption measurements for instrumented parts of the code. Since complex applications show different behavior in each part of the code, it is essential to be able to insert instrumentation to analyze these parts. Because each performance analysis or autotuning tool can bring different insights into an application's behavior, it is valuable to analyze and optimize an application using a variety of them. We present our on-request inserted shared C/C++ API for the most common open-source HPC performance analysis tools, which simplifies the process of manual instrumentation. Besides manual instrumentation, profiling libraries provide different methods for instrumentation. Of these, binary patching is the most universal mechanism, and it greatly improves the user-friendliness and robustness of the tool. We provide an overview of the most commonly used binary patching tools and describe a workflow for using them to implement a binary instrumentation tool for any profiler or autotuner. We have also evaluated the minimum overhead of the manual and binary instrumentation.
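The idea of manual region instrumentation can be sketched as follows. This is a Python analogue for illustration only, not the shared C/C++ API presented in the paper: the `region` helper and the `region_stats` table are hypothetical, and only wall-clock time is recorded, since portable energy counters are not assumed to be available.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated call count and wall-clock time per region name.
region_stats = defaultdict(lambda: {"calls": 0, "seconds": 0.0})


@contextmanager
def region(name):
    """Mark the start and end of an instrumented region (hypothetical helper)."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the measurement even if the region body raises.
        stats = region_stats[name]
        stats["calls"] += 1
        stats["seconds"] += time.perf_counter() - start


def simulate():
    # Two stand-in workloads, each wrapped in its own region.
    with region("assemble"):
        sum(i * i for i in range(10_000))
    for _ in range(3):
        with region("solve"):
            sum(i for i in range(10_000))


simulate()
for name, stats in sorted(region_stats.items()):
    print(f"{name}: {stats['calls']} call(s), {stats['seconds']:.6f} s")
```

Wrapping each region in a start/stop pair keeps the instrumentation independent of the measuring backend, which is the property that lets one set of annotations serve several analysis tools.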
In this paper we present the ESPRESO FEM library, which includes a FEM toolbox with interfaces to professional and open-source simulation tools, and a massively parallel Hybrid Total FETI (HTFETI) solver which can fully utilize the OLCF Titan supercomputer and achieves super-linear scaling. This paper presents several new techniques for FETI solvers designed for efficient utilization of supercomputers, with a focus on: (i) performance: we present a fivefold reduction of solver runtime for the Laplace equation, achieved by redesigning the FETI solver and offloading the key workload to the accelerator. We compare Intel Xeon Phi 7120p and Tesla K80 and P100 accelerators to Intel Xeon E5-2680v3 and Xeon Phi 7210 CPUs; and (ii) memory efficiency: we present two techniques which increase the efficiency of the HTFETI solver 1.8 times and push the limit of the largest possible problem ESPRESO can solve from 124 to 223 billion unknowns for problems with unstructured meshes. Finally, we show that by dynamically tuning hardware parameters we can reduce energy consumption by up to 33%.