2013
DOI: 10.1155/2013/167841

From Physics Model to Results: An Optimizing Framework for Cross-Architecture Code Generation

Abstract: Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multithreading, combining…

Cited by 13 publications (14 citation statements)
References 37 publications
“…The framework focuses only on the GPU architecture. Similarly, work in [1] utilises a simple decomposition method with uniform partition where each processor and accelerator receives blocks of the same size. On the other hand, authors in [20] provide a method that allows programmers to partition the data contiguously between CPU and GPU within a single node.…”
Section: Related Work (mentioning)
confidence: 99%
“…for computational fluid dynamics, geometric modelling, solving partial differential equations or image and video processing [1][2][3][4][5]. As computing time and memory usage grow linearly with the number of array elements in stencil computations our research targets highly parallel implementations of stencil codes together with task scheduling and optimization techniques taking into consideration energy cost and data locality [6][7][8][9][10].…”
Section: Introduction (mentioning)
confidence: 99%
“…Some of the implementations, especially those taking advantage of modern GPUs, have become specific gems in the world of high performance computing [13]. The choice of computational architecture of this kind was not incidental though, as its great potential has already been demonstrated in many other works related to scientific simulations [14,15], databases [16,17] or optimization problems [18]. Historically, the first implementation of the Smith-Waterman algorithm using CUDA-capable GPUs was developed by Manavski S. et al. [19].…”
Section: Related Work (mentioning)
confidence: 99%
“…They have been successfully used as accelerators for example in gas and oil industry [7,8], medical imaging [9][10][11], bioinformatics [12][13][14], metaheuristics [15], or stencil-based computations [16,17]. Nevertheless, the primary application of GPUs is still the image and video processing [18][19][20][21].…”
Section: Introductionmentioning
confidence: 99%