Abstract: Instruction memoization is a promising technique to reduce the power consumption and increase the performance of future low-end/mobile multimedia systems. Power and performance efficiency can be improved by reusing instances of an already executed operation. Unfortunately, this technique may not always be worth the effort due to the power consumption and area impact of the tables required to leverage an adequate level of reuse. In this paper, we introduce and evaluate a novel way of understanding multimedia floating-point operations based on the fuzzy computation paradigm: performance and power consumption can be improved at the cost of small precision losses in computation. By exploiting this implicit characteristic of multimedia applications, we propose a new technique called tolerant memoization. This technique expands the capabilities of classic memoization by associating entries with similar inputs to the same output. We evaluate this new technique by measuring the effect of tolerant memoization for floating-point operations in a low-power multimedia processor and discuss the trade-offs between performance and quality of the media outputs. We report energy improvements of 12 percent for a set of key multimedia applications with a small lookup table (LUT) of 6 Kbytes, compared to 3 percent obtained using previously proposed techniques.
Index Terms: Low-power design, special-purpose and application-based systems, real-time and embedded systems.
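The idea of mapping similar inputs to the same table entry can be illustrated with a minimal software sketch. The abstract does not say how similarity is decided, so the mechanism used here (dropping the lowest mantissa bits of each single-precision operand before the lookup) is an assumption, as are the names `TolerantMemo`, `tolerant_key`, `dropped_bits`, and `capacity`, and the simple oldest-first eviction.

```python
import struct


def float_bits(x):
    """Return the IEEE-754 single-precision bit pattern of x as an integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def tolerant_key(x, dropped_bits):
    """Discard the lowest mantissa bits so that nearby values share one key.

    Assumption: this truncation is only one possible notion of "similar".
    """
    return float_bits(x) >> dropped_bits


class TolerantMemo:
    """Small lookup table that reuses results for similar operand pairs."""

    def __init__(self, op, dropped_bits=8, capacity=256):
        self.op = op
        self.dropped_bits = dropped_bits
        self.capacity = capacity
        self.table = {}
        self.hits = 0
        self.misses = 0

    def __call__(self, a, b):
        key = (tolerant_key(a, self.dropped_bits),
               tolerant_key(b, self.dropped_bits))
        if key in self.table:
            # Reuse a stored result, accepting a small precision loss.
            self.hits += 1
            return self.table[key]
        self.misses += 1
        result = self.op(a, b)
        if len(self.table) >= self.capacity:
            # Evict the oldest entry (dicts preserve insertion order).
            self.table.pop(next(iter(self.table)))
        self.table[key] = result
        return result


fmul = TolerantMemo(lambda a, b: a * b, dropped_bits=12)
first = fmul(1.5, 2.0)       # miss: computed and stored
second = fmul(1.5001, 2.0)   # hit: shares the key of (1.5, 2.0)
```

With `dropped_bits=0` the table behaves like classic exact-match memoization; larger values trade output precision for a higher hit rate.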
patients with different neurological outcomes. Methods: We studied 49 patients who had suffered a severe TBI and 10 healthy control subjects using 18F-FDG-PET. The patients were divided into three groups: the MCS&VS group (n=17), which included patients who were in a vegetative or a minimally conscious state; the In-PTA group (n=12), which included patients in post-traumatic amnesia (PTA); and the Out-PTA group (n=20), which included patients who had recovered from PTA. SPM5 software was used to determine the metabolic differences between the groups. FDG-PET images were normalized and four regions of interest were generated around the thalamus, precuneus and the frontal and temporal lobes. The groups were parameterized using Student's t-test. Principal component analysis was used to obtain an intensity-estimated value per subject to correlate the function between the structures. Results: Differences in glucose metabolism in all structures were related to the neurological outcome, and the most severely affected patients showed the most severe hypometabolism. We also found a significant correlation between the cortico-thalamo-cortical metabolism in all groups. Conclusions: Voxel-based analysis suggests a functional correlation between these four areas, and decreased metabolism was associated with a less favorable outcome. Higher levels of activation of the cortico-cortical connections appear to be related to better neurological conditions. Differences in the thalamo-cortical correlations between patients and controls may be related to traumatic dysfunction due to focal or diffuse lesions.
Task-based programming models such as OpenMP 5.0 and OmpSs are simple to use and powerful enough to exploit task parallelism of applications over multicore, manycore and heterogeneous systems. However, their software-only runtimes introduce significant overhead when targeting fine-grained tasks, resulting in performance losses. To overcome this drawback, we present Picos++, a hardware runtime that accelerates critical runtime functions such as task dependence analysis, nested task support, and heterogeneous task scheduling. As a proof of concept, the Picos++ hardware runtime has been integrated with a compiler infrastructure that supports parallel task-based programming models. An FPGA SoC running Linux has been used to implement the hardware-accelerated part of Picos++, integrated with a heterogeneous system composed of 4 symmetric multiprocessor (SMP) cores and several hardware functional accelerators (HwAccs) for task execution. Results show significant improvements in energy and performance compared to state-of-the-art parallel software-only runtimes. With Picos++, applications can achieve up to 7.6x speedup and save up to 90% of energy when using 4 threads and up to 4 HwAccs, and even reach a speedup of 16x over the software alternative when using 12 HwAccs and small tasks.
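To make "task dependence analysis" concrete, the following is a minimal software sketch of the bookkeeping a task runtime performs when tasks declare the data they read and write. It is a generic illustration and not the Picos++ design; the function name `build_dependences` and the `(name, inputs, outputs)` task format are hypothetical.

```python
def build_dependences(tasks):
    """Compute, for each task, the earlier tasks it must wait for.

    tasks: list of (name, inputs, outputs) in program order, where inputs
    and outputs are lists of data identifiers (for example, addresses).
    Returns a dict mapping each task name to a set of predecessor names.
    """
    last_writer = {}          # data id -> most recent task that wrote it
    readers_since_write = {}  # data id -> tasks that read it since that write
    preds = {}
    for name, inputs, outputs in tasks:
        deps = set()
        for data in inputs:
            if data in last_writer:
                deps.add(last_writer[data])              # read after write
        for data in outputs:
            if data in last_writer:
                deps.add(last_writer[data])              # write after write
            deps.update(readers_since_write.get(data, ()))  # write after read
        deps.discard(name)
        # Register this task's own accesses for later tasks.
        for data in inputs:
            readers_since_write.setdefault(data, set()).add(name)
        for data in outputs:
            last_writer[data] = name
            readers_since_write[data] = set()
        preds[name] = deps
    return preds


example = [
    ("A", [], ["x"]),
    ("B", ["x"], ["y"]),
    ("C", ["x"], ["z"]),
    ("D", ["y", "z"], ["x"]),
]
graph = build_dependences(example)
```

In this example, B and C depend only on A and may run in parallel, while D must wait for A, B, and C. For fine-grained tasks, this per-task bookkeeping is the kind of work whose cost can rival the task's own execution time.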
OmpSs@FPGA is the flavor of OmpSs that allows offloading application functionality to FPGAs. Like OpenMP, it is based on compiler directives. While the OpenMP specification also includes support for heterogeneous execution, we use OmpSs and OmpSs@FPGA as a prototype implementation to develop new ideas for OpenMP. OmpSs@FPGA implements the tasking model with runtime support to automatically exploit all SMP and FPGA resources available in the execution platform. In this paper, we present the OmpSs@FPGA ecosystem, based on the Mercurium compiler and the Nanos++ runtime system. We show how the applications are transformed to run on the SMP cores and the FPGA. The application kernels defined, using the OmpSs directives, as tasks to be accelerated are: 1) transformed by the compiler into kernels connected with the proper synchronization and communication ports, 2) extracted to intermediate files, 3) compiled through the FPGA vendor HLS tool, and 4) used to configure the FPGA. Our Nanos++ runtime system schedules the application tasks on the platform and can use the SMP cores and the FPGA accelerators at the same time. We present the evaluation of the OmpSs@FPGA environment with the Matrix Multiplication, Cholesky and N-Body benchmarks, showing the internal details of the execution and the performance obtained on a Zynq Ultrascale+ MPSoC (up to 128x). The source code uses OmpSs@FPGA annotations, and different Vivado HLS optimization directives are applied for acceleration.