Several factors limit the scalability of parallel applications, e.g., off-chip bus saturation and data synchronization. Moreover, the high cost of cooling HPC systems, which can outweigh the cost of developing the system itself, has raised the requirements on parallel application execution in terms of both performance and energy. In this work, we propose AtTune: a heuristic-based framework that tunes the number of processes/threads and the CPU frequency to optimize the execution of parallel applications. AtTune is transparent to the user, independent of the input size, and works across different parallel programming models. We evaluated our proposed solution on five well-known kernels implemented in MPI and OpenMP. Experimental results on two real multi-core systems showed that AtTune improves energy efficiency, performance, and Energy-Delay Product by up to 36%, 11%, and 32%, respectively.
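To make the Energy-Delay Product (EDP) metric concrete, the following minimal Python sketch computes EDP as energy multiplied by execution time and picks the configuration with the lowest value. It is an illustration only: the function names, the dictionary keys, and the sample measurements are hypothetical, and the exhaustive comparison shown here stands in for, and is not, AtTune's heuristic search.

```python
def edp(energy_joules, time_seconds):
    """Energy-Delay Product: energy consumed multiplied by execution time."""
    return energy_joules * time_seconds


def best_config(measurements):
    """Return the measurement with the lowest EDP.

    This is a plain exhaustive comparison over already-measured
    configurations, not the heuristic used by AtTune.
    """
    return min(measurements, key=lambda m: edp(m["energy_j"], m["time_s"]))


# Hypothetical measurements (illustrative values, not results from the paper).
SAMPLE_MEASUREMENTS = [
    {"threads": 4, "freq_ghz": 2.0, "energy_j": 120.0, "time_s": 10.0},
    {"threads": 8, "freq_ghz": 2.0, "energy_j": 150.0, "time_s": 7.0},
    {"threads": 16, "freq_ghz": 2.0, "energy_j": 220.0, "time_s": 6.5},
    {"threads": 8, "freq_ghz": 1.6, "energy_j": 130.0, "time_s": 8.5},
]

if __name__ == "__main__":
    choice = best_config(SAMPLE_MEASUREMENTS)
    print(choice["threads"], choice["freq_ghz"])
```

In this made-up data set, the 16-thread run is the fastest but not the best in EDP terms, which mirrors the point that more threads do not always pay off once energy is taken into account.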
To meet the increasing performance demands of applications, the number of cores in a single chip package has been increasing. However, heat has been rising at an even higher rate, which accelerates the aging process in modern processors. Therefore, wisely balancing the use of resources is important to extend the processor's longevity. Frequently, performance stagnates once a certain number of concurrent threads are executing. In such cases, the only result of adding threads is a temperature rise that directly influences the aging process, reducing the processor lifetime. This imbalance can originate from many factors, including the way threads communicate and synchronize. Considering that those characteristics are related to the Parallel Programming Interface (PPI) used to parallelize the application, this work evaluates three widely used PPIs executing on an embedded multicore processor. We show that, depending on the characteristics of the application, merely switching from one PPI to another can reduce the effects of aging. For that, we have developed a model based on the Arrhenius equation. We show that OpenMP has a lower impact on processor aging for memory-bound applications: up to 38% and 68% lower than PThreads and MPI, respectively. On the other hand, PThreads has the lowest impact on processor aging for CPU-bound applications.
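As a rough illustration of why temperature matters for aging, the sketch below computes the standard Arrhenius acceleration factor between a reference temperature and an operating temperature. It is not the model developed in the work above: the function name, the default activation energy of 0.7 eV, and the example temperatures are assumptions chosen only for demonstration.

```python
import math

# Boltzmann constant in eV/K.
BOLTZMANN_EV_PER_K = 8.617333262e-5


def arrhenius_acceleration(t_ref_celsius, t_op_celsius, activation_energy_ev=0.7):
    """Arrhenius acceleration factor of aging at t_op relative to t_ref.

    AF = exp((Ea / k) * (1 / T_ref - 1 / T_op)), temperatures in kelvin.
    The default activation energy is an assumed illustrative value.
    """
    t_ref_k = t_ref_celsius + 273.15
    t_op_k = t_op_celsius + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_ref_k - 1.0 / t_op_k)
    return math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical temperatures: 60 °C reference versus 80 °C under load.
    print(arrhenius_acceleration(60.0, 80.0))
```

A factor above 1 means the processor ages faster at the operating temperature than at the reference; because the dependence is exponential, a modest temperature rise from extra threads that bring no speedup can still shorten lifetime noticeably.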