2015
DOI: 10.1145/2872887.2750375

Flexible software profiling of GPU architectures

Abstract: To aid application characterization and architecture design space exploration, researchers and engineers have developed a wide range of tools for CPUs, including simulators, profilers, and binary instrumentation tools. With the advent of GPU computing, GPU manufacturers have developed similar tools leveraging hardware profiling and debugging hooks. To date, these tools are largely limited by the fixed menu of options provided by the tool developer and do not offer the user the flexibility to observe or act on …

Cited by 26 publications (21 citation statements)
References 21 publications
“…It allows users to practically assess the impact of errors on GPU applications. SASSIFI is based on the SASSI GPU assembly language instrumentation tool also developed by the NVIDIA Architecture Research Group [21]. Although not an official part of the CUDA software toolkit SASSI and SASSIFI are research prototypes which provide a selective instrumentation framework for NVIDIA GPU applications.…”
Section: GPU Application Error Resilience Testing (mentioning)
confidence: 99%
“…While non-unrolled vector addition is a simple example, address generation also makes up a significant amount of total dynamic instructions across complex and optimized code. To further illustrate this concept, we instrumented a set of CUDA benchmarks (Section 7.3) using SASSI (Stephenson et al 2015) to generate dynamic instruction execution histograms by instruction PC. In order to allocate integer instructions into the address generation (Agen), control (Control), and compute (Compute_Int, Compute_FP) categories, we further performed a backtrace using the source registers of relevant instructions.…”
Section: Address Generation Overheads (mentioning)
confidence: 99%
“…First, modern architectures are giving unprecedented insight into the GPU's hardware activity: for example, NVIDIA Tesla architectures can access over 200 counters and metrics. Furthermore, we expect the trend of increased GPU hardware transparency to continue; for example, new research to create more flexible tools for profiling of GPU hardware events is underway (see SASSI of Stephenson et al [2015]). Yet most power modeling research seeks to learn model parameters from power observations of ∼50 benchmarks.…”
Section: Conclusion and Future Research Directions (mentioning)
confidence: 99%
“…are a few questions beginning to surface in the research. Additional research is probing how to apply the power modeling research for an optimal balance of power and performance (e.g., see Jia et al [2015]) and pioneering flexible profiling tools for monitoring GPU processes (e.g., see Stephenson et al [2015]). Only in the most recent architectures are a large number of the GPU hardware events observable, and how to harness these for accurate understanding of power is thinly addressed.…”
Section: Introduction (mentioning)
confidence: 99%