2009
DOI: 10.1002/cpe.1462

Exploiting graphical processing units for data‐parallel scientific applications

Abstract: Graphical processing units (GPUs) have recently attracted attention for scientific applications such as particle simulations. This is partially driven by low commodity pricing of GPUs but also by recent toolkit and library developments that make them more accessible to scientific programmers. We discuss the application of GPU programming to two significantly different paradigms: regular mesh field equations with unusual boundary conditions, and graph analysis algorithms. The differing optimization techniq…

Citations: cited by 37 publications (33 citation statements)
References: 46 publications
“…Our experiments have showed that this technique can slightly improve the kernel execution time when processing scale-free graphs with their "fat tail" degree distributions, but the overall execution time, which includes the time required to sort the vertices by the CPU, increases considerably. These findings are in line with previous results reported in [29]. Consequently, the implementation of Kernel 2 reported here does not sort the vertices.…”
Section: Vertex-based Kernel
Citation type: supporting
confidence: 93%
“…We can also make use of specialised memory types available on the GPU to improve memory access. In our previous work [29] it was shown that the optimal memory type to use for simulating field equations is the texture memory type. The texture memory type uses an on-chip memory cache that caches values from global memory in the spatial locality.…”
Section: Cuda Implementations
Citation type: mentioning
confidence: 99%
“…Many later studies [17] use that data structure as foundational graph representation. A. Leist et al [18] propose another kind of graph representation which makes some improvement on compact adjacency list. As Fig.…”
Section: B Compact Adjacency List
Citation type: mentioning
confidence: 99%
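
For readers unfamiliar with the term used in the statement above, the following is a minimal Python sketch of a compact adjacency list. It assumes the term refers to the common layout of two flat arrays: one of per-vertex offsets and one of concatenated neighbour lists. The function names `build_compact_adjacency` and `neighbours_of` are hypothetical and chosen for illustration only. This is not the improved representation attributed to Leist et al., which the excerpt does not describe.

```python
def build_compact_adjacency(num_vertices, edges):
    """Pack a directed edge list into two flat arrays (illustrative sketch).

    offsets[v] .. offsets[v + 1] delimits the slice of `neighbours`
    that holds the out-neighbours of vertex v.
    """
    # Count the out-degree of every vertex.
    degree = [0] * num_vertices
    for src, _ in edges:
        degree[src] += 1

    # A running sum of the degrees gives each vertex's start offset.
    offsets = [0] * (num_vertices + 1)
    for v in range(num_vertices):
        offsets[v + 1] = offsets[v] + degree[v]

    # Fill the neighbour array, tracking the next free slot per vertex.
    neighbours = [0] * len(edges)
    cursor = offsets[:-1].copy()
    for src, dst in edges:
        neighbours[cursor[src]] = dst
        cursor[src] += 1

    return offsets, neighbours


def neighbours_of(offsets, neighbours, v):
    """Return the out-neighbours of vertex v as a list slice."""
    return neighbours[offsets[v]:offsets[v + 1]]


if __name__ == "__main__":
    # Hypothetical 4-vertex graph used only to exercise the sketch.
    offsets, neighbours = build_compact_adjacency(
        4, [(0, 1), (0, 2), (1, 2), (3, 0)]
    )
    print(offsets, neighbours, neighbours_of(offsets, neighbours, 0))
```

Two flat arrays are convenient on a GPU because they can be copied to device memory in one transfer each, and a thread assigned to vertex `v` needs only `offsets[v]` and `offsets[v + 1]` to find its neighbours.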
“…It is possible to bring parallelism to bear on many problems using hybrids of cluster-computing approaches; accelerator technologies such as general purpose graphical processing unit (GP-GPU); and the use of many threads within a conventional multi-core CPU. These are typified by software technologies such as the open standard Message Passing Interface (MPI) [12], [13]; NVIDIA's Compute Unified Device Architecture (CUDA) [14]-[16] for GPUs; and Intel's Thread Building Blocks (TBB) [17], [18] software for multi-threaded programming multi-core devices, respectively. It is however tedious, error prone and non-trivial for a programmer or even a programming team to implement an application that works well across all these three parallel paradigms or platforms - even for a problems like finite-difference equation solving that have relatively well known solutions.…”
Section: Introduction
Citation type: mentioning
confidence: 99%