Chunhua Liao scite author profile

Abstract. OpenMP is a popular and evolving programming model for shared-memory platforms. It relies on compilers for optimal performance and to target modern hardware architectures. A variety of extensible and robust research compilers are key to OpenMP's sustainable success in the future. In this paper, we present our efforts to build an OpenMP 3.0 research compiler for C, C++, and Fortran using the ROSE source-to-source compiler framework. Our goal is to support OpenMP research for ourselves and others. We have extended ROSE's internal representation to handle all of the OpenMP 3.0 constructs and facilitate their manipulation. Since OpenMP research is often complicated by the tight coupling of the compiler translations and the runtime system, we present a set of rules to define a common OpenMP runtime library (XOMP) on top of multiple runtime libraries. These rules additionally define how to build a set of translations targeting XOMP. Our work demonstrates how to reuse OpenMP translations across different runtime libraries. This work simplifies OpenMP research by decoupling the problematic dependence between the compiler translations and the runtime libraries. We present an evaluation of our work by demonstrating an analysis tool for OpenMP correctness. We also show how XOMP can be defined using both GOMP and Omni and present comparative performance results against other OpenMP compilers.
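
The decoupling idea in this abstract can be illustrated with a toy model. The sketch below is not the XOMP interface and does not use the real GOMP or Omni entry points: `RuntimeA`, `RuntimeB`, `common_parallel`, and `outlined_body` are hypothetical names, and threads are simulated by a serial loop. It only shows how a translation written once against a common layer could run on two backends with different call signatures.

```python
class RuntimeA:
    """Hypothetical backend: one call taking the outlined function and its data."""
    def parallel(self, func, data, num_threads):
        for tid in range(num_threads):      # serial stand-in for real threads
            func(tid, num_threads, data)

class RuntimeB:
    """Hypothetical backend: takes a thread count and a callable without data."""
    def do_parallel(self, num_threads, body):
        for tid in range(num_threads):      # serial stand-in for real threads
            body(tid, num_threads)

def common_parallel(runtime, func, data, num_threads):
    """Common-layer entry point: the only runtime call the translated code uses."""
    if isinstance(runtime, RuntimeA):
        runtime.parallel(func, data, num_threads)
    else:
        # Adapt the signature by binding the data into the callable.
        runtime.do_parallel(num_threads, lambda tid, n: func(tid, n, data))

def outlined_body(tid, num_threads, data):
    """Outlined parallel region: each thread sums a strided share of the input."""
    data["partial"][tid] = sum(data["values"][tid::num_threads])

def translated_sum(runtime, values, num_threads=4):
    """Translated code, written once against the common layer."""
    data = {"values": values, "partial": [0] * num_threads}
    common_parallel(runtime, outlined_body, data, num_threads)
    return sum(data["partial"])
```

Calling `translated_sum(RuntimeA(), values)` and `translated_sum(RuntimeB(), values)` exercises the same translated code on both backends; only `common_parallel` knows how each backend is invoked.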

OpenUH: an optimizing, portable OpenMP compiler

Liao

Hernández

Chapman

et al. 2007

Concurrency and Computation

Summary. OpenMP has gained wide popularity as an API for parallel programming on shared memory and distributed shared memory platforms. Despite its broad availability, there remains a need for a portable, robust, open source, optimizing OpenMP compiler for C/C++/Fortran 90, especially for teaching and research, for example into its use on new target architectures, such as SMPs with chip multi-threading, as well as learning how to translate for clusters of SMPs. In this paper, we present our efforts to design and implement such an OpenMP compiler on top of Open64, an open source compiler framework, by extending its existing analysis and optimization and adopting a source-to-source translator approach where a native back end is not available. The compilation strategy we have adopted and the corresponding runtime support are described. The OpenMP validation suite is used to determine the correctness of the translation. The compiler's behavior is evaluated using benchmark tests from the EPCC microbenchmarks and the NAS parallel benchmark.

Early Experiences with the OpenMP Accelerator Model

Liao

Yan

Supinski

et al. 2013

Abstract. A recent trend in mainstream computer nodes is the combined use of general-purpose multicore processors and specialized accelerators such as GPUs and DSPs in order to achieve better performance and to reduce power consumption. To support this trend, the OpenMP Language Committee has approved a set of extensions to OpenMP (referred to as the OpenMP accelerator model). The initial version is the subject of Technical Report 1 (TR1) while OpenMP 4.0 Release Candidate 2 (RC2) further refines the extensions. In this paper, we examine the newly released accelerator directives and create an initial reference implementation, referred to as HOMP (Heterogeneous OpenMP). Focused on targeting NVIDIA GPUs, our work is based on an existing OpenMP implementation in the ROSE source-to-source compiler infrastructure. HOMP includes extensions to parse the new constructs and to represent them in the AST and other compiler translation details. Further, we provide initial runtime support. For our evaluation, we have adapted a few existing OpenMP codes to use the accelerator model directives and present preliminary performance results. Finally, we critique the accelerator model in terms of its impact on developers and compiler writers and suggest possible improvements.

Auto-tuning full applications: A case study

Tiwari

Hollingsworth

Chen

et al. 2011

The International Journal of High Performance Computing Applica

In this paper, we take a concrete step towards materializing our long-term goal of providing a fully automatic end-to-end tuning infrastructure for arbitrary program components and full applications. We describe a general-purpose offline auto-tuning framework and apply it to an application benchmark, SMG2000, a semi-coarsening multigrid on structured grids. We show that the proposed system first extracts computationally-intensive loop nests into separate executable functions, a code transformation called outlining. The outlined loop nests are then tuned by the framework and subsequently integrated back into the application. Each loop nest is optimized through a series of composable code transformations, with the transformations parameterized by unbound optimization parameters that are bound during the tuning process. The values for these parameters are selected using a search-based auto-tuner, which performs a parallel heuristic search for the best-performing optimized variants of the outlined loop nests. We show that our system pinpoints a code variant that performs 2.37 times faster than the original loop nest. When the full application is run using the code variant found by the system, the application's performance improves by 27%.
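
As a rough illustration of the outline-then-tune workflow described in this abstract, the sketch below treats a small loop nest as an outlined function with one unbound parameter (a tile size) and binds it by timing each candidate. It is a simplification under stated assumptions: `outlined_kernel` and `tune` are hypothetical names, the kernel is a toy matrix-vector product, not code from SMG2000, and the search is a sequential exhaustive sweep, not the parallel heuristic search used in the paper.

```python
import time

def outlined_kernel(a, x, n, tile):
    """Hypothetical outlined loop nest: column-tiled matrix-vector product."""
    y = [0.0] * n
    for jj in range(0, n, tile):                 # tiled column loop
        for i in range(n):
            row = a[i]
            acc = 0.0
            for j in range(jj, min(jj + tile, n)):
                acc += row[j] * x[j]
            y[i] += acc
    return y

def tune(kernel, args, candidates, repeats=3):
    """Bind the unbound parameter to each candidate and keep the fastest one."""
    timings = {}
    for tile in candidates:
        best = float("inf")
        for _ in range(repeats):                 # keep the best of a few runs
            start = time.perf_counter()
            kernel(*args, tile)
            best = min(best, time.perf_counter() - start)
        timings[tile] = best
    return min(timings, key=timings.get), timings

n = 64
a = [[float(i + j) for j in range(n)] for i in range(n)]
x = [1.0] * n
best_tile, timings = tune(outlined_kernel, (a, x, n), [4, 16, 64])
```

The selected `best_tile` depends on the machine and on timing noise. Every variant computes the same result, which is what allows the tuned version to be integrated back into the application.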

Bridging the gap between deep learning and sparse matrix format selection

Zhao

Liao

et al. 2018

Using Polyhedral Analysis to Verify OpenMP Applications are Data Race Free

Ye

Schordan

Liao

et al. 2018

Symbolic Analysis of Concurrency Errors in OpenMP Programs

Diersen

Wang

et al. 2013

Abstract. In this paper we present the OpenMP Analysis Toolkit (OAT), which uses Satisfiability Modulo Theories (SMT) solver-based symbolic analysis to detect data races and deadlocks in OpenMP codes. Our approach approximately simulates real executions of an OpenMP program through schedule permutation. We conducted experiments on real-world OpenMP benchmarks and student homework assignments by comparing our OAT tool with two commercial dynamic analysis tools, Intel Thread Checker and Sun Thread Analyzer, and one commercial static analysis tool, Viva64 PVS Studio. The experiments show that our symbolic analysis approach is more accurate than static analysis and more efficient and scalable than dynamic analysis tools, with fewer false positives and negatives.
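
The notion of schedule permutation can be sketched with a brute-force toy. This is not how OAT works internally, since OAT encodes schedules symbolically for an SMT solver. Here the schedules of two simulated threads are simply enumerated. The names `interleavings`, `run`, and `final_values` are hypothetical, and the check only detects the lost-update symptom of a race on a shared counter.

```python
def interleavings(a, b):
    """Yield every merge of sequences a and b that preserves each one's order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest

def run(schedule):
    """Execute one schedule of (thread id, operation) steps on a shared counter."""
    shared = 0
    local = {}
    for tid, op in schedule:
        if op == "read":            # load the shared value into a private copy
            local[tid] = shared
        elif op == "write":         # store the private copy plus one
            shared = local[tid] + 1
        else:                       # "incr": an atomic read-modify-write
            shared += 1
    return shared

def final_values(thread0, thread1):
    """Collect the final counter value over all schedules of the two threads."""
    return {run(s) for s in interleavings(thread0, thread1)}

# Unsynchronized x = x + 1 in each thread, split into a read and a write.
racy = final_values([(0, "read"), (0, "write")], [(1, "read"), (1, "write")])
# The same update performed as a single atomic step.
atomic = final_values([(0, "incr")], [(1, "incr")])
```

More than one possible final value, as in `racy`, means the result depends on the schedule. A single value, as in `atomic`, means every explored schedule agrees.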

Chunhua Liao

Effective Source-to-Source Outlining to Support Whole Program Empirical Optimization

A ROSE-Based OpenMP 3.0 Research Compiler Supporting Multiple Runtime Libraries
