2020
DOI: 10.1109/access.2020.2990418

When Parallel Speedups Hit the Memory Wall

Abstract: After Amdahl's trailblazing work, many other authors proposed analytical speedup models but none have considered the limiting effect of the memory wall. These models exploited aspects such as problem-size variation, memory size, communication overhead, and synchronization overhead, but data-access delays are assumed to be constant. Nevertheless, such delays can vary, for example, according to the number of cores used and the ratio between processor and memory frequencies. Given the large number of possible con…
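For reference, the classical Amdahl baseline that the abstract starts from can be sketched as below. This is a minimal illustration, not the model proposed in the paper: the function name `amdahl_speedup` and its parameters are assumptions chosen for this example. Like the earlier models the abstract describes, it treats data-access delay as constant, so it cannot show a memory-wall effect.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Classical Amdahl speedup: 1 / ((1 - p) + p / n).

    parallel_fraction (p) and cores (n) are illustrative names.
    The formula assumes memory-access cost does not change with n.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be in [0, 1]")
    if cores < 1:
        raise ValueError("cores must be >= 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Speedup for a program that is 90% parallel, on a growing core count.
    for n in (1, 2, 4, 8, 16):
        print(n, round(amdahl_speedup(0.9, n), 2))
```

Under this baseline the speedup is bounded by 1 / (1 - p) however many cores are added. The abstract's point is that delays which vary with core count and with the processor-to-memory frequency ratio are not captured by a constant-delay formula of this kind.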

Cited by 7 publications (4 citation statements)
References 31 publications
“…In future work, we intend to evolve this tool to include the ability to predict speedup and efficiency from a few samples using state-of-the-art prediction models present in the literature [26,27]. The idea is to present the general behavior of the program and its scalability trend and reduce the execution time necessary to compose a comprehensive analysis.…”
Section: Discussion (mentioning)
confidence: 99%
“…It is the only parameter that characterizes the application. Indeed, by modelling complex parallel applications using their parallel fraction only, we are neglecting the effects that a frequency change has on the performance of the memory hierarchy, on the parallel overhead, and on the distribution of load across the heterogeneous cores [36]. Besides, as those features should limit the parallel speedup, it is expected that the sequential fraction includes them.…”
Section: Application Performance Modelling (mentioning)
confidence: 99%
“…Variations of aspects such as memory size and hierarchy, number of cores, and input size may have very diverse effects in parallel software performance [22]. Since the relation between the chunk size of the parallel loops and the total execution time of a program is unknown, using a stochastic optimization method to find an optimal chunk size is an alternative.…”
Section: CSA-based Auto-tuning (mentioning)
confidence: 99%