2009 International Conference on High Performance Computing (HiPC) 2009
DOI: 10.1109/hipc.2009.5433179
A performance prediction model for the CUDA GPGPU platform

Cited by 79 publications (48 citation statements); references 17 publications.
“…Zhang and Owens [14] adopted a microbenchmark-based approach to develop a throughput model for three major components of GPU execution time: the instruction pipeline, shared memory access, and global memory access. Their model focuses on identifying performance bottlenecks and guiding programmers toward optimization; our model focuses on predicting the execution time, which is similar to [15]–[17]. Baghsorkhi et al. [15] presented a compiler-based GPU performance modeling approach with accurate prediction using program analysis and symbolic evaluation techniques.…”
Section: Related Work
confidence: 99%
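The bottleneck idea in the excerpt above can be sketched as a toy calculation: estimate a time for each of the three components (work divided by throughput) and take the slowest one as the kernel's bottleneck. This is an illustrative approximation, not Zhang and Owens' actual model; the function name and all numbers are hypothetical.

```python
# Illustrative bottleneck-style throughput estimate (hypothetical
# parameters; NOT the actual Zhang-Owens model).

def kernel_time_bottleneck(inst_cycles, smem_accesses, gmem_accesses,
                           inst_throughput, smem_throughput, gmem_throughput):
    """Estimate kernel time as the slowest of three components.

    Each component time = work / throughput; the component with the
    largest time is the performance bottleneck.
    """
    components = {
        "pipeline": inst_cycles / inst_throughput,
        "shared": smem_accesses / smem_throughput,
        "global": gmem_accesses / gmem_throughput,
    }
    bottleneck = max(components, key=components.get)
    return components[bottleneck], bottleneck

# Hypothetical kernel: here global memory dominates.
t, which = kernel_time_bottleneck(1e9, 2e8, 5e7, 5e11, 1e11, 1e10)
print(which, t)
```

A model of this shape is useful for optimization guidance precisely because it names the limiting component, whereas a pure execution-time predictor need not.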
“…Their model estimates the number of parallel memory requests by taking into account the number of running threads and the memory bandwidth. Kothapalli et al. [17] presented a performance model that combines several known models of parallel computation: BSP, PRAM, and QRQW. However, their proposed analytical models are based on an abstraction of the GPU architecture.…”
Section: Related Work
confidence: 99%
“…Kothapalli et al. [20] have presented a combination of known models with small extensions. The models they have used are the BSP model, the PRAM model by Fortune and Wyllie [21], and the QRQW model by Gibbons [2].…”
Section: It Is Common To Present the Parameters of the BSP Model As A…
confidence: 99%
“…However, algorithms developed on these models do not always show good performance on GPUs because the PRAM models are substantially different from actual GPU architectures. For estimating the performance of GPU-based algorithms, several models have been proposed [5,6,7,8]. Hong et al. [9] and Kothapalli et al. [5] have proposed analytical models to estimate the actual running time of GPU-based algorithms without executing applications on GPUs.…”
Section: Introduction
confidence: 99%
“…For estimating the performance of GPU-based algorithms, several models have been proposed [5,6,7,8]. Hong et al. [9] and Kothapalli et al. [5] have proposed analytical models to estimate the actual running time of GPU-based algorithms without executing applications on GPUs. Ma et al. [7] and Nakano [8] have proposed memory access models that take memory access latency into account.…”
Section: Introduction
confidence: 99%
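To illustrate why memory access latency matters in such models: on a GPU, latency can be overlapped only when enough concurrent warps are available to keep requests in flight. The sketch below is a generic, hypothetical latency-versus-bandwidth estimate, not the specific model of Ma et al. or Nakano; all parameter values are invented for illustration.

```python
# Hedged sketch of a latency-aware memory cost estimate (hypothetical
# parameters; not any specific published model). With enough concurrent
# warps, memory latency is overlapped; otherwise it is exposed.

def memory_time(requests, latency, bandwidth_time, active_warps):
    """Estimate total memory time in cycles for `requests` accesses.

    latency        -- cycles for one access to complete
    bandwidth_time -- minimum cycles between successive requests
    active_warps   -- concurrent warps available to hide latency
    """
    # Warps needed so that issue rate, not latency, is the limit.
    warps_to_hide = latency / bandwidth_time
    if active_warps >= warps_to_hide:
        return requests * bandwidth_time          # bandwidth-bound
    # Otherwise part of each access latency is exposed.
    return requests * latency / active_warps      # latency-bound

print(memory_time(1000, latency=400, bandwidth_time=4, active_warps=100))
print(memory_time(1000, latency=400, bandwidth_time=4, active_warps=10))
```

The two-regime structure (bandwidth-bound versus latency-bound) is the qualitative behavior that PRAM-style models miss and that latency-aware GPU models are designed to capture.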