2010 IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum (IPDPSW)
DOI: 10.1109/ipdpsw.2010.5470682
Performance modeling of heterogeneous systems

Abstract: As the complexity of parallel computers grows, constraints posed by the construction of larger systems require both greater, and increasingly non-linear, parameter sets to model their behavior realistically. These heterogeneous characteristics create a trade-off between the complexity and accuracy of performance models, creating challenges in utilizing them for design decisions. In this thesis, we take a bottom-up approach to realistically model software and hardware interactions, by composing system models fro…

Cited by 7 publications (7 citation statements)
References 64 publications (91 reference statements)
“…Much work has also been done on performance modeling for heterogeneous systems, including approaches using machine learning and performance counters. Wu et al. predicted how kernels scale when GPU hardware properties change, based on performance counter values, using a machine learning model.…”
Section: Related Work
confidence: 99%
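The counter-based prediction approach described in the excerpt above can be sketched roughly as follows. This is a minimal illustrative fit, not Wu et al.'s actual model: the counter values, runtimes, and the single-feature linear form are all made up for the example.

```python
# Hypothetical sketch of performance prediction from a hardware
# counter (illustrative only; not Wu et al.'s model or data).
# Fit runtime = a * counter + b by ordinary least squares, then
# predict runtime for an unseen kernel configuration.

samples = [  # (performance-counter value, measured runtime in ms) -- invented data
    (1.0, 2.1), (2.0, 4.0), (3.0, 6.1), (4.0, 7.9),
]
n = len(samples)
sx = sum(x for x, _ in samples)
sy = sum(y for _, y in samples)
sxx = sum(x * x for x, _ in samples)
sxy = sum(x * y for x, y in samples)
a = (n * sxy - sx * sy) / (n * sxx - sx * sx)  # slope
b = (sy - a * sx) / n                          # intercept
predicted = a * 5.0 + b  # predict runtime at counter value 5.0
print(round(predicted, 2))  # → 9.9
```

Real counter-based models typically use many counters and a non-linear learner, but the workflow is the same: measure counters on training kernels, fit a model, and query it for unseen configurations.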
“…In the following, we will simplify the analysis by restricting ourselves to ImageCL-like applications, as described in Section 6. These applications follow the bulk synchronous parallel (BSP) model, consisting of stages of parallel computation on local data, interleaved with global communication and synchronization. Our method is, however, not tied to ImageCL and can also be generalized to other models than BSP.…”
Section: Performance Modeling
confidence: 99%
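The BSP structure quoted above (local computation, then global communication and synchronization, repeated in supersteps) can be sketched with threads and a barrier. This is a generic illustration of the BSP pattern, not code from the cited work; the worker count, superstep count, and the sum-reduction "communication" step are arbitrary choices.

```python
import threading

NUM_WORKERS = 4
NUM_SUPERSTEPS = 3
barrier = threading.Barrier(NUM_WORKERS)
shared = [0] * NUM_WORKERS  # one communication slot per worker

def worker(rank: int) -> None:
    local = rank  # each worker's local data
    for _ in range(NUM_SUPERSTEPS):
        local += 1                # local computation phase
        shared[rank] = local      # publish result for others
        barrier.wait()            # global synchronization point
        local = sum(shared)       # "communication": read all slots
        barrier.wait()            # reads done before next superstep's writes

threads = [threading.Thread(target=worker, args=(r,)) for r in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(shared)  # deterministic because of the two barriers per superstep
```

The two barriers per superstep are what make the schedule deterministic despite concurrent workers, which is exactly the property that makes BSP-style applications amenable to analytical performance modeling.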
“…Furthermore, analytical performance models for GPUs and heterogeneous systems have been developed [42][43][44][45] and used for auto-tuning. 5 Much work has been done on creating machine learning-based models and using them for tuning and auto-tuning, e.g., to determine loop unroll factors, 28,46 which optimizations to apply for parallel stencil computations, 7,47 Message Passing Interface (MPI) parameters, 14 and general compiler optimizations.…”
Section: Related Work
confidence: 99%
“…Khan et al. used a script-based auto-tuning compiler to translate sequential C loop nests to parallel CUDA code. Furthermore, analytical performance models for GPUs and heterogeneous systems have been developed and used for auto-tuning.…”
Section: Related Work
confidence: 99%
“…for stencil computations [7], matrix multiplication [8] and FFTs [9]. Furthermore, analytical performance models for GPUs and heterogeneous systems have been developed [10,11,12,13] and used for auto-tuning [14].…”
Section: Related Work
confidence: 99%