2018 IEEE International Conference on Computational Science and Engineering (CSE)
DOI: 10.1109/cse.2018.00026
HSTREAM: A Directive-Based Language Extension for Heterogeneous Stream Computing

Abstract: Big data streaming applications require utilization of heterogeneous parallel computing systems, which may comprise multiple multi-core CPUs and many-core accelerating devices such as NVIDIA GPUs and Intel Xeon Phis. Programming such systems requires advanced knowledge of several hardware architectures and device-specific programming models, including OpenMP and CUDA. In this paper, we present HSTREAM, a compiler directive-based language extension to support programming stream computing applications for heterog…

Cited by 10 publications (5 citation statements) · References 18 publications
“…In general, as long as the capacity of the cache/shared memory is not exceeded, the larger the tile size, the better the data locality. As shown in Figure 3, the sizes (16,32) clause of the tile directive indicates that the matrices A, B, and C are split into (M∕16 + 1) × (N∕32 + 1) tiles respectively and the two nested for-loops are transformed into four nested for-loops.…”
Section: Loop Optimizationmentioning
confidence: 99%
See 1 more Smart Citation
“…In general, as long as the capacity of the cache/shared memory is not exceeded, the larger the tile size, the better the data locality. As shown in Figure 3, the sizes (16,32) clause of the tile directive indicates that the matrices A, B, and C are split into (M∕16 + 1) × (N∕32 + 1) tiles respectively and the two nested for-loops are transformed into four nested for-loops.…”
Section: Loop Optimizationmentioning
confidence: 99%
“…OmpSs 31 is a task‐based parallel programming model composed of a set of directives and library routines, which enables the effective parallelization of applications across multiple heterogeneous devices (such as GPUs and FPGAs). HSTREAM 32 is a high‐level parallel programming model based on OpenMP‐like compiler directives, which enables programmers to easily develop stream computing applications that can be cooperatively performed on both multi‐core CPUs and accelerators. AIRA 33 is a programming framework that supports the flexible execution of compute kernels written using standard OpenMP directives and clauses on heterogeneous CPU‐GPU platforms.…”
Section: Related Workmentioning
confidence: 99%
“…Other works look at performance optimization for numerical solvers [38], sparse matrix-vector multiplication [39], [40], and dynamic stochastic economic models [39]. Ferrão et al [41] and Memeti et al [42] develop stream processing frameworks for the Xeon Phi to increase programming productivity. The runtime can automatically distribute workloads across CPUs and accelerating devices.…”
Section: Domain-specific Optimizationsmentioning
confidence: 99%
“…As presented in [68], many powerful HPC systems are heterogeneous, in the sense that they combine general-purpose CPUs with accelerators such as Graphics Processing Units (GPUs) or Field Programmable Gate Arrays (FPGAs) [69]. Several HPC approaches [70,71,72,73] have been developed to improve the performance of advanced and data-intensive modeling and simulation applications. The parallel computing paradigm may be used on multi-core CPUs, many-core processing units (such as GPUs [74]), re-configurable hardware platforms (such as FPGAs), or over distributed infrastructure (such as a cluster, Grid, or Cloud).…”
Section: Task Parallelization and High-performance Computingmentioning
confidence: 99%