2021
DOI: 10.48550/arxiv.2102.05297
Preprint

Using hardware performance counters to speed up autotuning convergence on GPUs

Abstract: Nowadays, GPU accelerators are commonly used to speed up general-purpose computing tasks on a variety of hardware. However, due to the diversity of GPU architectures and processed data, optimization of codes for a particular type of hardware and specific data characteristics can be extremely challenging. The autotuning of performance-relevant source-code parameters allows for automatic optimization of applications and keeps their performance portable. Although the autotuning process typically results in code sp…

Citations: cited by 1 publication (1 citation statement)
References: 32 publications (78 reference statements)

“…Kernel Tuning Toolkit (KTT) [7] is developed specifically to support online auto-tuning and pipeline tuning, which allows for the exploration of combinations of tunable parameters over multiple kernels. An interesting feature of KTT is the support to keep track of hardware performance counters, such as L2 cache utilization, during benchmarking, which can also be used in advanced search strategies [6]. Auto-Tuning Framework (ATF) [49] implements an innovative way to generate auto-tuning search spaces, for efficient storage and fast exploration of constrained search spaces, but does not focus on introducing new optimization algorithms.…”
Section: A Automated Performance Tuning
Citation type: mentioning
confidence: 99%