Helena Caminal scite author profile

Accelerating database analytic query workloads using an associative processor

Database analytic query workloads are heavy consumers of datacenter cycles, and there is constant demand to improve their performance. Associative processors (APs) have re-emerged as an attractive architecture that offers very large data-level parallelism that can be used to implement a wide range of general-purpose operations. Associative processing is based primarily on efficient search and bulk update operations. Analytic query workloads benefit from data-parallel execution and often feature both search and bulk update operations. In this paper, we investigate how amenable APs are to improving the performance of analytic query workloads. For this study, we use the recently proposed Content-Addressable Processing Engine (CAPE) framework. CAPE is an AP core that is highly programmable via the RISC-V ISA with standard vector extensions. By mapping key database operators to CAPE and introducing AP-aware changes to the query optimizer, we show that CAPE is a good match for database analytic workloads. We also propose a set of database-aware microarchitectural changes to CAPE to further improve performance. Overall, CAPE achieves a 10.8× speedup on average (up to 61.1×) on the SSB benchmark (a suite of 13 queries) compared to an iso-area aggressive out-of-order processor with AVX-512 SIMD support.

Performance and energy effects on task-based parallelized applications

Caminal, Caballero, Cebrian et al. 2018

J Supercomput

Heterogeneity, parallelization and vectorization are key techniques to improve the performance and energy efficiency of modern computing systems. However, programming and maintaining code for these architectures pose a huge challenge due to ever-increasing architectural complexity. Furthermore, there has been a swift and unstoppable burst of vector architectures across all market segments, from embedded to HPC. Vectorization can no longer be ignored, but manual vectorization is tedious, error-prone, and impractical for programmers. This work evaluates the feasibility of user-directed vectorization in task-based applications. Our evaluation is based on the OmpSs programming model, extended to support user-directed vectorization for different Intel SIMD architectures (SSE, AVX2, IMCI and AVX-512). Results show that user-directed codes match the performance and energy efficiency of manually optimized code with minimal code modifications, favoring portability across different SIMD architectures.

Helena Caminal

CAPE: A Content-Addressable Processing Engine

Stencil codes on a vector length agnostic architecture

Using Arm’s scalable vector extension on stencil codes

Accelerating database analytic query workloads using an associative processor

Performance and energy effects on task-based parallelized applications
