2020
DOI: 10.1109/tpds.2020.3004623

GPGPU Performance Estimation With Core and Memory Frequency Scaling

Cited by 36 publications (29 citation statements)
References 43 publications
“…When the core clock frequency is low enough for the application to be compute bound the execution time becomes dominated by computations and there is a direct dependency of the execution time on the core clock frequency. How the device memory latency is hidden and how it changes with the ratio of core and memory clock frequency is described in [40]. This is also supported by the analysis of the performance counters from NVVP which shows that an increase in the execution time at a particular critical frequency is due to the saturation of the number of issued instructions (see Fig.…”
Section: Discussion (mentioning)
confidence: 77%
“…The undervolting on GPUs was also explored by Mendes et al [43], where authors have achieved lower energy consumption without performance degradation and in some cases with better performance. Wang and Chu [40] introduce a fine-grained analytical model for estimation of the execution time of different GPU kernels. They have also investigated the memory latency and its dependency on core and memory frequency.…”
Section: Related Work (mentioning)
confidence: 99%
“…By following a similar approach Wang et al [6] proposed a DVFS-aware GPU performance model. The authors estimated the GPU architecture parameters using a collection of microbenchmarks and a group of performance counters, measured during their execution.…”
Section: Related Work (mentioning)
confidence: 99%
“…However, an efficient use of energy management techniques, such as DVFS, requires accurate models that can predict how the energy consumption changes with the GPU operating frequencies (and voltages). This type of modeling is often done by separately modeling the performance and the power consumption of the GPU, focusing on how each one separately scales with DVFS [6], [7]. On the other hand, several previous works have shown that the performance/power behavior of GPU applications considerably vary with the application characteristics [8], [9], which makes these predictive models to require some information from the application to provide accurate predictions.…”
Section: Introduction (mentioning)
confidence: 99%
“…Hong and Kim present a simple analytical GPU model to estimate the execution time of GPU kernels, based on estimating the number of parallel memory requests, by considering the number of running threads and memory bandwidth. Wang and Chu provide an improved GPU performance estimation technique that also takes core and memory frequency scaling into account. Unfortunately, their model parameters were determined using microbenchmarks, which have become obsolete for newer generations of GPUs.…”
Section: Related Work (mentioning)
confidence: 99%