2010 IEEE International Symposium on Parallel & Distributed Processing (IPDPS) 2010
DOI: 10.1109/ipdps.2010.5470394

Implementing the Himeno benchmark with CUDA on GPU clusters

Cited by 74 publications (57 citation statements)
References 1 publication
“…Previous implementations of stencil computations on GPUs can be grouped into three categories: (1) Hand-coded implementations of a particular stencil that strive to achieve the best performance possible [17,18,20] - but with optimization techniques that may not generalize to other types of stencils - (2) Implementations where ease of programming is the primary goal rather than performance - often with code generators for various stencils [5,22,14,11] and (3) implementations that focus on a particular parameter and study how tuning it can affect performance [13,16].…”
Section: Related Work (mentioning)
confidence: 99%
“…• 19-Point Stencil (Figure 2(c)): This is also called the Himeno benchmark, the behavior of which is detailed elsewhere [20]. We use the same specification (Table I in [20]), except for ignoring the last line of residual calculation.…”
Section: Design Overview (mentioning)
confidence: 99%
“…The details of the Himeno computation can be found in [20,21]. Figure 6 shows how Himeno is expressed in HiDP.…”
Section: D Stencil Computation (mentioning)
confidence: 99%
“…In data parallel applications, it provides a powerful and relatively low cost platform with a potential for significant amount of performance speedup over a traditional CPU approach. CUDA extends C or Fortran by allowing the programmer to define functions, called kernels, that when called are executed on the GPU by potentially thousands of parallel threads [3]. Therefore, there has been an explosion of interest and research in using this platform for high performance computing [4]-[9].…”
Section: A Nvidia Compute Unified Device Architecture (mentioning)
confidence: 99%