After Amdahl's trailblazing work, many other authors proposed analytical speedup models, but none have considered the limiting effect of the memory wall. These models exploit aspects such as problem-size variation, memory size, communication overhead, and synchronization overhead, but they assume data-access delays to be constant. Such delays can nevertheless vary, for example, with the number of cores used and the ratio between processor and memory frequencies. Given the large number of operating-frequency and core-count configurations that current architectures offer, speedup models that describe the variation among these configurations are desirable for off-line or on-line scheduling decisions. This work proposes new parallel speedup models that account for variations of the average data-access delay to describe the limiting effect of the memory wall on parallel speedups. Analytical results indicate that the proposed modeling can capture the desired behavior, and experimental hardware results validate it. Additionally, we show that, by accounting for parameters that reflect intrinsic characteristics of the applications, such as degree of parallelism and susceptibility to the memory wall, our proposal has significant advantages over machine-learning-based modeling. Moreover, besides being black-box modeling, conventional machine-learning modeling needs, in our experiments, about one order of magnitude more measurements to reach the same level of accuracy achieved by our modeling.
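For reference, the baseline that these speedup models extend is Amdahl's law, S(n) = 1 / ((1 - p) + p / n), where p is the parallelizable fraction of the work and n is the number of cores. The sketch below shows only this classical baseline, which implicitly treats data-access delay as constant; it is not the memory-wall model proposed in the abstract, whose equations are not given here. The function name `amdahl_speedup` is chosen for illustration.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Classical Amdahl speedup: 1 / ((1 - p) + p / n).

    Baseline only: data-access delay is implicitly constant here,
    which is the assumption the abstract above argues against.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # With p = 0.9 the speedup can never exceed 1 / (1 - p) = 10,
    # no matter how many cores are added.
    for n in (1, 2, 4, 8, 16, 64):
        print(f"cores={n:3d}  speedup={amdahl_speedup(0.9, n):.2f}")
```

Under this baseline the speedup is bounded by 1 / (1 - p) regardless of core count; a model that accounts for the memory wall would additionally let the effective delay depend on the core count and on the processor-to-memory frequency ratio.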
Programmable circuits such as general-purpose processors or FPGAs have their end-user energy efficiency strongly dependent on the program that they execute. Ultimately, it is the programmer's ability to code and, in the case of general-purpose processors, the compiler's ability to translate source code into a sequence of native instructions that make the circuit deliver the expected performance to the end user. Consequently, the benefits of energy-efficient circuits built upon energy-efficient devices could be obscured by poorly written software. Clearly, having well-written software running on conventional circuits is no better in terms of energy efficiency than having poorly written software running on energy-efficient circuits. Therefore, to get the most out of the energy-saving capabilities of programmable circuits that support low-voltage operating modes, it is necessary to address software issues that might work against the benefits of operating in such modes. Multiple software layers are used to abstract away hardware details. Such abstraction enables the solution of complex problems while significantly increasing the programmer's productivity. This comes with a caveat. Software developers often make poor programming choices because they do not understand how their code affects the usage of the various resources available on the hardware. In the past, poor programming choices that decreased performance were not an issue, since rapid hardware performance improvements compensated for them. More recently, faster switching circuits have become much harder to achieve on modern processors with high silicon density due to overheating and quantum effects. Moreover, with the IoT revolution, many systems depend on limited or unreliable energy sources. These constraints now require the software developer to put significant effort into writing code that meets both the energy and performance requirements of a large number of applications.
This is a hard challenge, as software developers lack the tools that would enable resource-aware software development [1]. To tackle the stagnation of single-CPU performance, the microprocessor industry introduced multiprocessing on a single chip. The rationale is simple: if we cannot make them faster, let us make more of them and split the computation among them. With that, conventional sequential software, including its many layers, is gradually giving way to parallel software. However, embracing this solution requires accepting a new challenge: writing efficient parallel software. This is significantly harder than creating efficient sequential code. To make things worse, programmers commonly lack the skills needed to create parallel software. Addressing the software issues that prevent the benefits of low operating voltage modes from reaching the end user might then seem to add to the challenge of writing good parallel software. Nonetheless, parallel software could as well be an opportunity to employ low operating voltage devic...
Energy efficiency is a growing concern for modern computing, especially for HPC, due to operational costs and environmental impact. We propose a methodology to find the energy-optimal frequency and number of active cores to run single-node HPC applications using an application-agnostic power model of the architecture and an architecture-aware performance model of the application. We characterize the application performance using Support Vector Regression. The power consumption is estimated by modeling CMOS dynamic and static power without knowledge of the application. The energy-optimal configuration is estimated by minimizing the product of the power model's and the performance model's outcomes. Results for four PARSEC applications with five different inputs show that the proposed approach used about 14× less energy when compared to the worst case of the default Linux DVFS governor. For the best case of the DVFS scheme, 23% savings were observed, with an overall average of 6% less energy.
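The selection step described above, minimizing energy as the product of predicted power and predicted execution time over the available frequencies and core counts, can be sketched as an exhaustive grid search. Everything numeric below is an assumption made for illustration: the constants, the linear voltage-frequency relation, and the Amdahl-like `runtime_seconds` placeholder, which stands in for the Support Vector Regression performance model and is not the authors' model.

```python
from itertools import product


def power_watts(freq_ghz, cores, c_eff=1.0, v_base=0.5, v_per_ghz=0.4, p_static=5.0):
    """Toy CMOS power model: static power plus per-core dynamic power C * V^2 * f.

    All constants are hypothetical; voltage is assumed to rise linearly with frequency.
    """
    voltage = v_base + v_per_ghz * freq_ghz
    return p_static + cores * c_eff * voltage ** 2 * freq_ghz


def runtime_seconds(freq_ghz, cores, work=100.0, parallel_fraction=0.9):
    """Placeholder performance model (Amdahl-like), swapped in for the SVR model."""
    serial = (1.0 - parallel_fraction) * work
    parallel = parallel_fraction * work / cores
    return (serial + parallel) / freq_ghz


def energy_joules(freq_ghz, cores):
    """Energy estimate: predicted power times predicted execution time."""
    return power_watts(freq_ghz, cores) * runtime_seconds(freq_ghz, cores)


def energy_optimal(freqs_ghz, core_counts):
    """Return the (frequency, cores) pair with the lowest estimated energy."""
    return min(product(freqs_ghz, core_counts), key=lambda cfg: energy_joules(*cfg))


if __name__ == "__main__":
    freq, cores = energy_optimal([1.2, 1.6, 2.0, 2.4], [1, 2, 4, 8])
    print(f"energy-optimal configuration: {freq} GHz on {cores} cores")
```

An exhaustive search is practical here because a single node exposes only a small number of frequency and core-count combinations; the two models can be replaced independently, which mirrors the separation between the application-agnostic power model and the application-specific performance model.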