2012 Innovative Parallel Computing (InPar) 2012
DOI: 10.1109/inpar.2012.6339594

OP2: An active library framework for solving unstructured mesh-based applications on multi-core and many-core architectures

Abstract: OP2 is an "active" library framework for the solution of unstructured mesh-based applications. It utilizes source-to-source translation and compilation so that a single application code written using the OP2 API can be transformed into different parallel implementations for execution on different back-end hardware platforms. In this paper we present the design of the current OP2 library, and investigate its capabilities in achieving performance portability, near-optimal performance, and scaling on modern multi-…

citations
Cited by 78 publications
(72 citation statements)
references
References 16 publications
“…With such an explicit access-descriptor, OP2 allows for optimization and parallel programming experts to choose significantly more radical implementations for very specific hardware in order to gain near-optimal performance. This paper documents a number of significant developments in the design of OP2's heterogeneous back-ends and their performance extending our previous work in [36]: (1) A major contribution is the development of OP2's MPI+OpenMP back-end design and performance which augments the MPI only and MPI+CUDA implementations. This new back-end provides key insights into the performance limiting factors of modern multi-core clusters, particularly demonstrating the issues encountered on NUMA type architectures of multi-core nodes.…”
Section: Related Work (supporting)
confidence: 59%
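The role of the access descriptor mentioned in this statement can be illustrated with a small sketch. This is not OP2's API: the names `Access`, `Arg`, and `needs_conflict_handling` are hypothetical, and the rule shown (a modification made through a mapping may race, while direct or read-only access does not) is a simplified assumption about what a back-end could infer from declared access modes.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Access(Enum):
    """Hypothetical access modes a loop argument can declare."""
    READ = "read"
    WRITE = "write"
    INC = "inc"


@dataclass(frozen=True)
class Arg:
    """One loop argument: which data, reached through which map, and how."""
    dat: str
    via_map: Optional[str]  # None means direct access on the iteration set
    access: Access


def needs_conflict_handling(args: List[Arg]) -> bool:
    """Return True if any argument is modified through a mapping.

    In that case two iterations may update the same target element, so a
    parallel back-end would need some strategy (assumed here, not taken
    from the source) such as colouring or atomic updates.
    """
    return any(a.via_map is not None and a.access is not Access.READ for a in args)


# Loop over edges that reads node coordinates and increments node residuals.
edge_loop = [
    Arg("x", "edge_to_node", Access.READ),
    Arg("res", "edge_to_node", Access.INC),
]

# Loop over cells that only touches data stored on the cells themselves.
cell_loop = [
    Arg("q", None, Access.READ),
    Arg("q_old", None, Access.WRITE),
]

print(needs_conflict_handling(edge_loop))
print(needs_conflict_handling(cell_loop))
```

Because the access modes are declared rather than inferred from the loop body, a decision like this can be made without inspecting the user's kernel, which is the kind of freedom the quoted statement attributes to the explicit descriptor.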
“…Later in [13], [14], [16], crucial efforts of evaluating the thread-level performance potentials of PETSc-FUN3D on wide spectrum of architectures are presented. On the other hand, SU2 code of Stanford [59] and OP2 code of Oxford [60] are considered to be the state-of-the-practice unstructured CFD research codes, which both have recently been ported into many emerging HPC architectures [61], [62].…”
Section: Unstructured Aerodynamics Computations (mentioning)
confidence: 99%
“…The OP2 library (Mudalige et al (2012)) is a domain specific language embedded in C and Fortran that allows unstructured mesh algorithms to be expressed at a high level, and provides automatic parallelisation and a number of other features. It provides an abstraction that lets the domain scientist describe a mesh using a number of sets (such as quadrilaterals or vertices), connections between these sets (such as edges-to-nodes), and data defined on sets (such as x, y coordinates on vertices).…”
Section: The OP2 Domain Specific Language (mentioning)
confidence: 99%
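The sets, connections, and data described in this statement can be sketched in a few lines. The sketch below is written in Python for brevity and does not use OP2 itself: the classes `MeshSet`, `Mapping`, and `Dat` and the function `edge_lengths` are hypothetical stand-ins, and the four-vertex square mesh is made up for illustration.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class MeshSet:
    """A named collection of mesh elements, e.g. vertices or edges."""
    name: str
    size: int


@dataclass
class Mapping:
    """Connectivity from each element of `source` to `arity` elements of `target`."""
    source: MeshSet
    target: MeshSet
    arity: int
    indices: List[int]  # flat list, `arity` entries per source element

    def targets_of(self, i: int) -> List[int]:
        return self.indices[i * self.arity:(i + 1) * self.arity]


@dataclass
class Dat:
    """Data attached to a set, `dim` values per element."""
    on: MeshSet
    dim: int
    values: List[float]  # flat list, `dim` entries per element

    def at(self, i: int) -> List[float]:
        return self.values[i * self.dim:(i + 1) * self.dim]


# A unit square: four vertices joined by four edges.
vertices = MeshSet("vertices", 4)
edges = MeshSet("edges", 4)
edge_to_node = Mapping(edges, vertices, 2, [0, 1, 1, 2, 2, 3, 3, 0])
coords = Dat(vertices, 2, [0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0])


def edge_lengths(edge_map: Mapping, xy: Dat) -> List[float]:
    """Loop over the edge set, reaching vertex data through the mapping."""
    lengths = []
    for e in range(edge_map.source.size):
        a, b = edge_map.targets_of(e)
        ax, ay = xy.at(a)
        bx, by = xy.at(b)
        lengths.append(math.hypot(bx - ax, by - ay))
    return lengths


print(edge_lengths(edge_to_node, coords))
```

The loop body only ever sees an edge and the two vertices it maps to, so the iterations are described independently of how the mesh is stored or traversed. That separation is what would allow a framework to parallelise such a loop automatically.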
“…OP2, by Mudalige et al (2012), is such a DSL, embedded in C/C++ and Fortran; it has been in development since 2009: it provides an abstraction for expressing unstructured mesh computations at a high-level, and then provides automated tools to translate scientific code written once, into a range of high-performance implementations targeting multi-core CPUs, GPUs, and large heterogeneous supercomputers. The original VOLNA model (Dutykh et al (2011)) was already discussed and validated in detail - was used in production for small-scale experiments and modelling, but was inadequate for targeting large-scale scenarios and statistical analysis, therefore it was re-implemented on top of OP2; this paper describes the process, challenges and results from that work.…”
mentioning
confidence: 99%