Energy consumption in multicore embedded systems has become a constant concern. Exploiting Thread-Level Parallelism (TLP) may reduce energy consumption: because performance improves, execution time shrinks and the processor spends less energy on static power. However, as will be shown in this paper, the influence of static power on energy consumption and on the Energy-Delay Product (EDP) depends on how significant static power is in the processor. By evaluating different levels of static power with respect to total power consumption in two embedded processors (ARM and Atom), we demonstrate that if static power consumption is tuned to the right value during design and manufacturing, it is possible to save up to 35% in energy consumption and achieve improvements of up to 20% in EDP efficiency (i.e., the best possible use of the available resources). We also show that the more communication a parallel application has, the lower the impact of the processor's static power on total energy consumption.
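The relationship between static power, execution time, energy, and EDP that this abstract relies on can be illustrated with a minimal sketch. It assumes the common first-order model in which energy is total power multiplied by execution time and EDP is energy multiplied by execution time. The function names and all numbers are hypothetical and are not taken from the paper.

```python
def energy_joules(static_w, dynamic_w, seconds):
    """Energy under a first-order model: (static + dynamic power) * time."""
    return (static_w + dynamic_w) * seconds


def edp(static_w, dynamic_w, seconds):
    """Energy-Delay Product: energy multiplied by execution time."""
    return energy_joules(static_w, dynamic_w, seconds) * seconds


# Hypothetical numbers, for illustration only.
serial = energy_joules(static_w=1.0, dynamic_w=1.0, seconds=10.0)
# Assumption: two threads halve the run time and double dynamic power.
parallel = energy_joules(static_w=1.0, dynamic_w=2.0, seconds=5.0)
```

Under these assumed values the dynamic energy is unchanged by parallelization, so the whole saving comes from paying static power for a shorter time. Lowering `static_w` in both calls shrinks that saving, which is the dependence on the static-power share that the abstract describes.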
The MPI-2 standard has been implemented for a few years in most of the MPI distributions. Like MPI-1.2, it leaves it up to the user to decide when and where the processes must be run. Yet the dynamic creation of processes enabled by MPI-2 makes it harder to handle their scheduling manually. This paper presents a scheduler module, implemented with MPI-2, that determines on-line (i.e., during execution) on which processor a newly spawned process should be run. For MPI-2 programs with dynamic process creation, the scheduler can apply a basic Round-Robin mechanism or use load information to apply a list scheduling policy. A brief presentation of the scheduler is given, followed by experimental evaluations on three test programs: the Fibonacci computation, the N-Queens benchmark, and a computation of prime numbers. Even with the basic mechanisms that have been implemented, a clear gain is obtained in run time, in load balance, and consequently in the number of processes that the MPI program can run.
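The two placement policies named in this abstract can be sketched in a few lines. This is a simplified illustration in plain Python, not the paper's MPI-2 scheduler module: the class names are hypothetical, no MPI calls are made, and load is approximated by a count of processes already placed on each node rather than by measured load information.

```python
from itertools import cycle


class RoundRobinPlacer:
    """Assign each newly spawned process to the next node in turn."""

    def __init__(self, nodes):
        self._nodes = cycle(nodes)

    def place(self):
        return next(self._nodes)


class LeastLoadedPlacer:
    """Assign each new process to the node with the lowest known load."""

    def __init__(self, nodes):
        # Assumption: load is the number of processes placed so far.
        self.load = {node: 0 for node in nodes}

    def place(self):
        # Ties are broken by the order in which nodes were given.
        node = min(self.load, key=self.load.get)
        self.load[node] += 1
        return node
```

Round-Robin ignores how busy each node is, so an uneven workload can leave some nodes overloaded. The load-based variant avoids that at the cost of keeping load information up to date.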
Thread-Level Parallelism (TLP) exploitation for embedded systems has been a challenge for software developers: while it is necessary to take advantage of the availability of multiple cores, it is also mandatory to consume less energy. To speed up the development process and make it as transparent as possible, software designers use Parallel Programming Interfaces (PPIs). However, as will be shown in this paper, each PPI implements different ways to exchange data using shared memory regions, influencing performance, energy consumption and Energy-Delay Product (EDP), which vary across different embedded processors. By evaluating four PPIs and three multicore processors (ARM A8, A9 and Intel Atom), we demonstrate that by simply switching PPI it is possible to save up to 59% in energy consumption and achieve EDP improvements of up to 85% in the most significant case. We also show that the efficiency (i.e., the best possible use of the available resources) decreases as the number of threads increases in almost all cases, but at distinct rates.