2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
DOI: 10.1109/sc.2010.36
OpenMPC: Extended OpenMP Programming and Tuning for GPUs

Cited by 186 publications (111 citation statements)
References 13 publications
“…In contrast, general-purpose automatic parallelization compilers for accelerators including GPGPUs have been appearing recently, such as OpenACC [20], PGI Accelerator [27], and CAPS HMPP [28]. Moreover, academic proposals that assist in writing OpenCL and CUDA programs have also been presented [29]-[31]. HiCrypt differs from them in purpose: it is specialized for symmetric block ciphers.…”
Section: Translation Results (citation type: mentioning; confidence: 99%)
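For context on the directive-based model those compilers share: the programmer keeps sequential C and annotates offloadable loops, and the compiler generates the device kernel and the host-device transfers. Below is a minimal sketch in the OpenACC style; the vec_add function and its variable names are illustrative placeholders, not code from the cited papers.

/* Illustrative offloadable loop: the pragma asks an OpenACC-capable
 * compiler to build a GPU kernel for the loop and to copy the inputs
 * in and the result out. */
void vec_add(const float *a, const float *b, float *c, int n)
{
    #pragma acc parallel loop copyin(a[0:n], b[0:n]) copyout(c[0:n])
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

The appeal of this style, which the quoted passage gestures at, is that the same source still compiles as plain sequential C when the pragma is ignored.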
“…Other experimental compilation tools like CGCM [16] and PAR4ALL [1] aim at automating the process of CPU-GPU communication and the detection of the pieces of code that can run in parallel. The work by Lee and Eigenmann [20] proposes OpenMPC, an API to facilitate translation of OpenMP programs to CUDA, and a compilation system to support it.…”
Section: Related Work (citation type: mentioning; confidence: 99%)
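As a rough illustration of the programming model that quote describes: OpenMPC keeps the standard OpenMP annotations and adds #pragma cuda tuning directives that steer the OpenMP-to-CUDA translation. A minimal sketch, assuming the paper's "#pragma cuda gpurun" directive form; the registerRO clause and the saxpy loop are illustrative choices, not an excerpt from the paper.

/* Standard OpenMP loop plus an OpenMPC-style tuning directive (sketch). */
void saxpy(int n, float a, float *x, float *y)
{
    /* Assumed OpenMPC syntax: ask the translator to map this region to
     * a CUDA kernel and keep the read-only scalar 'a' in registers. */
    #pragma cuda gpurun registerRO(a)
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

A compiler that does not know the #pragma cuda extension simply ignores it, so one source can serve both CPU and GPU builds.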
“…Lee and Vetter evaluate 8 Rodinia benchmarks (out of the 15 we evaluate) and some scientific kernels, such as Jacobi or kernels from the NAS benchmarks. They also evaluate the PGI, CAPS, OpenMPC [20,21], and R-Stream [23] compilers. However, the main difference from this study is that the work reported here also includes the transformation steps that programmers must follow to turn OpenMP programs into directive-based programs so that these compilers can generate efficient accelerator code.…”
Section: Related Work (citation type: mentioning; confidence: 99%)
“…Several previous studies [1, 6, 8-11] have explored directive-based language extensions and compiler techniques to exploit parallelism using NVIDIA GPUs. We briefly mention a few of them in this section.…”
Section: Related Work (citation type: mentioning; confidence: 99%)
“…Lee and Eigenmann [10] presented an approach for directly translating OpenMP CPU code to GPU code without using language extensions. Compiler analysis finds the synchronization points in each parallel region, which can then be split into multiple subregions as needed to generate multiple CUDA kernels.…”
Section: Related Work (citation type: mentioning; confidence: 99%)
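A hand-written sketch of that splitting idea (not actual translator output): CUDA offers no portable device-wide barrier inside a kernel, so a parallel region containing an OpenMP barrier is split at the synchronization point into two kernels, and the in-order execution of launches on the same stream supplies the global synchronization. The region_part1/region_part2 names, the phase bodies, and the assumption that the pointers are device pointers are all illustrative.

#include <cuda_runtime.h>

/* Original region (conceptually):
 *   #pragma omp parallel
 *   { phase1();
 *     #pragma omp barrier
 *     phase2(); }
 */

/* Subregion before the barrier: each GPU thread does phase-1 work. */
__global__ void region_part1(const float *in, float *tmp, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) tmp[i] = in[i] * 2.0f;          /* placeholder phase-1 work */
}

/* Subregion after the barrier: reads a neighbor's phase-1 result,
 * which is exactly the cross-thread dependence the barrier protected. */
__global__ void region_part2(const float *tmp, float *out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i > 0 && i < n) out[i] = tmp[i] + tmp[i - 1];
}

void run_region(const float *in, float *tmp, float *out, int n)
{
    int threads = 256, blocks = (n + threads - 1) / threads;
    region_part1<<<blocks, threads>>>(in, tmp, n);
    /* Kernels launched on the same stream execute in order, so this
     * launch boundary stands in for the OpenMP barrier. */
    region_part2<<<blocks, threads>>>(tmp, out, n);
}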