Alex F. A. Furtunato scite author profile

Alex F. A. Furtunato

5 Publications

7 Citation Statements Received

18 Citation Statements Given


Affiliations

Instituto Federal do Rio Grande do Norte

Publications


When Parallel Speedups Hit the Memory Wall

Furtunato, Georgiou, Eder, et al. 2020

IEEE Access


After Amdahl's trailblazing work, many other authors proposed analytical speedup models but none have considered the limiting effect of the memory wall. These models exploited aspects such as problem-size variation, memory size, communication overhead, and synchronization overhead, but data-access delays are assumed to be constant. Nevertheless, such delays can vary, for example, according to the number of cores used and the ratio between processor and memory frequencies. Given the large number of possible configurations of operating frequency and number of cores that current architectures can offer, suitable speedup models to describe such variations among these configurations are quite desirable for off-line or on-line scheduling decisions. This work proposes new parallel speedup models that account for variations of the average data-access delay to describe the limiting effect of the memory wall on parallel speedups. Analytical results indicate that the proposed modeling can capture the desired behavior while experimental hardware results validate the former. Additionally, we show that when accounting for parameters that reflect the intrinsic characteristics of the applications, such as degree of parallelism and susceptibility to the memory wall, our proposal has significant advantages over machine-learning-based modeling. Moreover, besides being black-box modeling, our experiments show that conventional machine-learning modeling needs about one order of magnitude more measurements to reach the same level of accuracy achieved in our modeling.
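For orientation, the classical baseline that the abstract starts from is Amdahl's law, which assumes a fixed workload and constant data-access delays. The sketch below implements only that textbook formula. It does not reproduce the memory-wall-aware models proposed in the paper, and the function name and the example parallel fraction of 0.9 are illustrative assumptions.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Textbook Amdahl's law for a fixed-size workload.

    parallel_fraction: share of the sequential runtime that can run in parallel (0..1).
    cores: number of cores used (>= 1).
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unaffected by extra cores; the parallel part is divided among them.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical application with 90% parallelizable work.
    for n in (1, 2, 4, 8, 16):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

Because this baseline treats data-access delay as constant, its predicted speedup depends only on the core count, which is the limitation the abstract sets out to address.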


Energy-Optimal Configurations for Single-Node HPC Applications

Silva, Furtunato, Georgiou, et al. 2019


Application Speedup Characterization

Oliveira, Furtunato, Silveira, et al. 2018


The benefits of low operating voltage devices to the energy efficiency of parallel systems

Xavier‐de‐Souza, Neves, Furtunato, et al. 2017


Programmable circuits such as general-purpose processors or FPGAs have their end-user energy efficiency strongly dependent on the program that they execute. Ultimately, it is the programmer's ability to code and, in the case of general-purpose processors, the compiler's ability to translate source code into a sequence of native instructions that make the circuit deliver the expected performance to the end user. This way, the benefits of energy-efficient circuits built upon energy-efficient devices could be obfuscated by poorly written software. Clearly, having well-written software running on conventional circuits is no better in terms of energy efficiency than having poorly written software running on energy-efficient circuits. Therefore, to get the most out of the energy-saving capabilities of programmable circuits that support low voltage operating modes, it is necessary to address software issues that might work against the benefits of operating in such modes. Multiple software layers are used to abstract away hardware details. Such abstraction enables the solution of complex problems while at the same time the programmer's productivity increases significantly. This comes with a caveat. Software developers often make poor programming choices due to the lack of understanding of how their code affects the usage of the various resources available on the hardware. Poor programming choices leading to a decrease in performance were not an issue since rapid hardware performance improvements were able to compensate for them. More recently, faster switching circuits are much harder to achieve on modern processors with high silicon density due to overheating and quantum effects. Moreover, with the IoT revolution, many systems depend on limited or unreliable energy sources. These now require the software developer to put significant effort into writing code that will meet both the energy and performance requirements for a large number of applications. 
This is a hard challenge as software developers lack the tools that will enable resource-aware software development [1]. To tackle the stagnation of single CPU performance, the microprocessor industry introduced multiprocessing on a single chip. The rationale is simple: if we cannot make them faster, let us make more of them and split the computation among them. With that, gradually, conventional sequential software, including the many layers that compose the […], is giving place to parallel software. However, embracing this solution requires accepting a new challenge: writing efficient parallel software. This is significantly harder than creating efficient sequential code. To make things worse, programmers commonly are not equipped with the skills needed to create parallel software. Addressing the software issues that prevent the benefits of low operating voltage modes from reaching the end user might then seem to add to the challenge of writing good parallel software. Nonetheless, parallel software could as well be an opportunity to employ low operating voltage devic...


Energy-Optimal Configurations for Single-Node HPC Applications

Silva, Furtunato, Georgiou, et al. 2018

Preprint


Energy efficiency is a growing concern for modern computing, especially for HPC due to operational costs and the environmental impact. We propose a methodology to find energy-optimal frequency and number of active cores to run single-node HPC applications using an application-agnostic power model of the architecture and an architecture-aware performance model of the application. We characterize the application performance using Support Vector Regression. The power consumption is estimated by modeling CMOS dynamic and static power without knowledge of the application. The energy-optimal configuration is estimated by minimizing the product of the power model and the performance model's outcomes. Results for four PARSEC applications with five different inputs show that the proposed approach used about 14× less energy when compared to the worst case of the default Linux DVFS governor. For the best case of the DVFS scheme, 23% savings were observed, with an overall average of 6% less energy.
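The selection step described in the abstract, minimizing the product of a power model and a performance model over frequency and core count, can be sketched as a small grid search. This is a minimal illustration under loud assumptions, not the authors' implementation: the Support Vector Regression performance model is replaced by a simple Amdahl-style runtime formula, voltage is assumed to scale linearly with frequency (so dynamic power per core grows with the cube of frequency), and every function name and coefficient below is hypothetical.

```python
from itertools import product


def power_watts(freq_ghz, cores, c_dyn=1.5, p_static_core=0.8, p_base=10.0):
    """Hypothetical CMOS-style power model (all coefficients are made up).

    Assumes voltage scales linearly with frequency, so dynamic power per core ~ f^3.
    """
    return p_base + cores * (c_dyn * freq_ghz ** 3 + p_static_core)


def runtime_seconds(freq_ghz, cores, work=100.0, parallel_fraction=0.9):
    """Analytical stand-in for a learned performance model (Amdahl-style, not SVR)."""
    return (work / freq_ghz) * ((1.0 - parallel_fraction) + parallel_fraction / cores)


def energy_joules(freq_ghz, cores):
    """Energy estimate: product of the power model and the performance model."""
    return power_watts(freq_ghz, cores) * runtime_seconds(freq_ghz, cores)


def energy_optimal_config(freqs, core_counts):
    """Return the (frequency, cores) pair with the lowest estimated energy."""
    return min(product(freqs, core_counts), key=lambda cfg: energy_joules(*cfg))


if __name__ == "__main__":
    freqs = [1.2, 1.6, 2.0, 2.4]   # GHz, hypothetical available frequencies
    core_counts = [1, 2, 4, 8]     # hypothetical active-core options
    best = energy_optimal_config(freqs, core_counts)
    print("estimated energy-optimal (freq_ghz, cores):", best)
```

An exhaustive search is adequate here because the configuration space (available frequencies times core counts) is small on a single node; a fitted regression model could be dropped in for `runtime_seconds` without changing the search.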

