Proceedings of the Workshop on Memory Centric High Performance Computing 2018
DOI: 10.1145/3286475.3286482
Data Placement Optimization in GPU Memory Hierarchy using Predictive Modeling

Cited by 5 publications (4 citation statements)
References 5 publications
“…Here, n is the number of matrices and dim contains the dimensions of the matrices. For instance, for A(20×2) × B(2×30) × C(30×12) × D(12×8), the inputs are n = 4 and dim = [20, 2, 30, 12, 8].…”
Section: Serial Algorithm by Dynamic Programming Methods
confidence: 99%
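The quoted input convention can be made concrete with the textbook matrix-chain dynamic program: for n matrices, the dim array holds the n+1 boundary dimensions, so the example chain A(20×2) × B(2×30) × C(30×12) × D(12×8) corresponds to dim = [20, 2, 30, 12, 8]. A minimal sketch:

```python
def matrix_chain_cost(dim):
    """Minimum scalar multiplications to evaluate the chain, by dynamic programming."""
    n = len(dim) - 1                      # number of matrices
    # cost[i][j] = cheapest way to compute the product A_i ... A_j
    cost = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):        # chain length
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = min(
                cost[i][k] + cost[k + 1][j] + dim[i] * dim[k + 1] * dim[j + 1]
                for k in range(i, j)
            )
    return cost[0][n - 1]

print(matrix_chain_cost([20, 2, 30, 12, 8]))  # → 1232
```

The optimal order here groups around the narrow inner dimension 2, costing 1232 multiplications instead of the 1200 + 7200 + ... of a naive left-to-right evaluation.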
“…They showed that by understanding application I/O patterns and carefully designing data layouts, they increased read performance by more than 80%. [12] proposed a machine learning-based approach that builds a classifier to determine the class of GPU memory that minimizes GPU kernel execution time. This approach uses a set of performance counters obtained from profiling runs, along with hardware features, to generate the trained model.…”
Section: Related Work
confidence: 99%
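The classifier idea in the quote can be sketched as: represent each kernel by profiling-derived features and learn to pick the best memory class. This is a hedged toy illustration, not the cited work's model: the feature names, the labeling rule, and the nearest-centroid classifier are all stand-in assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
classes = ["global", "shared", "constant"]

# Hypothetical features per kernel: [data reuse ratio, coalescing rate, read-only fraction]
X = rng.random((200, 3))
# Toy labeling rule standing in for measured best-memory-class labels
y = np.where(X[:, 0] > 0.6, 1, np.where(X[:, 2] > 0.7, 2, 0))

# Nearest-centroid classifier standing in for the trained model
centroids = np.array([X[y == c].mean(axis=0) for c in range(3)])

def predict(features):
    """Return the memory class whose training centroid is closest."""
    d = np.linalg.norm(centroids - np.asarray(features), axis=1)
    return classes[int(np.argmin(d))]

print(predict([0.9, 0.5, 0.1]))  # high data reuse → "shared"
```

The point is the pipeline shape (counters in, memory-class label out), not the particular model; the cited approach could use any supervised classifier over the same kind of feature vectors.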
“…A larger data input means an instruction requires more streaming processors (SPs) to process the data, which explains why FP-ACO consumes more hardware resources. Research [28][29][30] shows that choosing an appropriate parallel model is vital to GPU performance. As shown in Figure 4, when a kernel function runs on the GPU, execution is a cooperation among the grids, the blocks, and the threads.…”
Section: Proposed Methods
confidence: 99%
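The grid/block/thread cooperation described above reduces, for each thread, to computing a global index from its coordinates. A minimal 1-D sketch in pure Python standing in for the CUDA built-ins:

```python
def global_thread_id(block_idx, block_dim, thread_idx):
    """CUDA equivalent: blockIdx.x * blockDim.x + threadIdx.x"""
    return block_idx * block_dim + thread_idx

# A grid of 4 blocks × 256 threads covers 1024 elements, one per thread
ids = [global_thread_id(b, 256, t) for b in range(4) for t in range(256)]
print(len(ids), ids[0], ids[-1])  # → 1024 0 1023
```

Choosing the grid and block dimensions so that this mapping covers the data without idle threads is exactly the "appropriate parallel model" decision the quote refers to.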
“…Similar to our strategy, the LIFT framework [6], [7] extracts low-level features from an intermediate representation (IR) and then uses a machine learning approach to predict performance from the extracted code features.…”
Section: Related Work
confidence: 99%
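The feature-based performance prediction the quote describes can be sketched as a regression from IR-derived counts to runtime. Everything here is an assumption for illustration: the feature names, the synthetic cost model, and the least-squares fit are stand-ins, not LIFT's actual features or model.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical features per kernel: [num_loads, num_stores, num_flops, num_branches]
X = rng.integers(1, 100, size=(50, 4)).astype(float)
# Synthetic "measured" runtimes: memory operations dominate in this toy cost model
t = 2.0 * X[:, 0] + 3.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0.0, 1.0, 50)

# Least-squares fit standing in for the learned performance model
coef, *_ = np.linalg.lstsq(X, t, rcond=None)
predicted = np.array([40.0, 10.0, 80.0, 5.0]) @ coef
print(round(predicted, 1))  # ≈ 2·40 + 3·10 + 0.5·80 = 150 under the toy model
```

The design point shared with the quoted approaches is that static code features, once extracted, let the model rank candidate code versions or placements without running each one.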