The paper presents state of the art of energy-aware high-performance computing (HPC), in particular identification and classification of approaches by system and device types, optimization metrics, and energy/power control methods. System types include single device, clusters, grids, and clouds while considered device types include CPUs, GPUs, multiprocessor, and hybrid systems. Optimization goals include various combinations of metrics such as execution time, energy consumption, and temperature with consideration of imposed power limits. Control methods include scheduling, DVFS/DFS/DCT, power capping with programmatic APIs such as Intel RAPL, NVIDIA NVML, as well as application optimizations, and hybrid methods. We discuss tools and APIs for energy/power management as well as tools and environments for prediction and/or simulation of energy/power consumption in modern HPC systems. Finally, programming examples, i.e., applications and benchmarks used in particular works are discussed. Based on our review, we identified a set of open areas and important up-to-date problems concerning methods and tools for modern HPC systems allowing energy-aware processing.
In the paper we present extensive results from analyzing energy/performance trade-offs with power capping observed on four different modern CPUs, for three different parallel applications such as 2D heat distribution, numerical integration and Fast Fourier Transform. The CPU tested represent both multi-core type CPUs such as Intel R Xeon R E5, desktop and mobile i7 as well as many-core Intel R Xeon Phi TM x200 but also server, desktop and mobile solutions used widely nowadays. We show that using enforced power caps we can find points of lower than default energy consumption but mostly for desktop and mobile solutions at the cost of increased execution time. We show with particular numbers how energy consumed, power consumption and execution time change for the point of minimum energy used versus the default configuration with no power limit, for each application and each tested CPU.
In the article we propose an automatic power capping software tool DEPO that allows one to perform runtime optimization of performance and energy related metrics. For an assumed application model with an initialization phase followed by a running phase with uniform compute and memory intensity, the tool performs automatic tuning engaging one of the two exploration algorithms-linear search (LS) and golden section search (GSS), finds a power cap optimizing a given metric and sets it for the remaining computations. The considered metrics include energy (E), energy-delay sum, energy-delay product. We present experimental results obtained for a set of benchmarks that differ in compute and memory intensity-parallel custom built OpenMP implementations of: numerical integration, heat distribution simulation (HEAT), fast Fourier transform (FFT), and additionally NAS parallel benchmarks: CG, MG, BT, SP, and LU. Tests were performed using multi-core CPUs that are representatives of modern servers and the desktop family: 2 × Intel Xeon E5-2670 v3 CPU (Haswell-EP) and Intel i7-9700K CPU (Coffee Lake). The results show that our approach enabled considerable improvements for the tested metrics, for example, for HEAT and Coffee Lake we minimized energy by 50% at the cost of a 15% increase in execution time (LS), for FFT energy was minimized by 40% at a 25.5% increase in execution time (GSS), for SP and Haswell energy was minimized by 25% at the cost of an 18.5% time increase and for Coffee Lake energy was decreased by 56% with a 12% time increase.
K E Y W O R D Sautomatic power capping, green computing, HPC, performance-energy trade-off, software tools
INTRODUCTIONNowadays, providing high-performance computing (HPC) resources can be expensive, especially when the power required by computing centers exceeds megawatts. Under such circumstances, every method that allows users to decrease power consumption is extremely desirable, and even low energy savings are multiplied by the effects of scale. Thus, new
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.