2012
DOI: 10.1007/978-3-642-28789-3_2

Creating and Debugging Performance CUDA C

Abstract: Various practical ways of testing, locating and removing bugs in parallel general-purpose computation on graphics hardware (GPGPU) applications are described. Some of these are generic, whilst others relate directly to stochastic bioinspired techniques, such as genetic programming. We pass on software engineering lessons learnt during CUDA C programming and ways to obtain high performance from nVidia GPU and Tesla cards, including examples of both successful and less successful recent applications.

Cited by 5 publications (2 citation statements)
References 28 publications
“…Performant parallel programming remains difficult [4]. After several decades of compiler development, it is widely accepted that completely automatic parallelisation using compiler technology is infeasible.…”
Section: Discussion (mentioning)
confidence: 99%
“…At what speed will that data be needed by the GPU processing cores? [Langdon, 2012]. Arithmetic intensity is the ratio of instructions performed per data item moved.…”
Section: Applying GI to a New GPU Application (mentioning)
confidence: 99%
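
The definition of arithmetic intensity quoted above can be illustrated with a small worked example. This is only a sketch in plain C, not code from the cited paper: the helper names `arithmetic_intensity` and `saxpy_intensity` are hypothetical, and the per-element counts for a SAXPY-style update (2 operations, 3 data items moved) are a simplifying assumption that ignores caching and instruction overhead.

```c
#include <stddef.h>

/* Hypothetical helper: operations performed per data item moved.
   Returns 0 when nothing is moved, to avoid dividing by zero. */
double arithmetic_intensity(double ops, double items_moved)
{
    return items_moved > 0.0 ? ops / items_moved : 0.0;
}

/* Assumed cost model for y[i] = a * x[i] + y[i] over n elements:
   2 operations per element (one multiply, one add) and
   3 items moved per element (read x[i], read y[i], write y[i]). */
double saxpy_intensity(size_t n)
{
    double ops = 2.0 * (double)n;
    double moved = 3.0 * (double)n;
    return arithmetic_intensity(ops, moved);
}
```

Under these assumptions the ratio is about 0.67 operations per item regardless of `n`, which suggests such a kernel would be limited by data movement rather than by computation.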