2020
DOI: 10.1109/tpds.2020.2996314

Correlation of Performance Optimizations and Energy Consumption for Stencil-Based Application on Intel Xeon Scalable Processors

Abstract: This article provides a comprehensive study of the impact of performance optimizations on the energy efficiency of a real-world CFD application called MPDATA, as well as an insightful analysis of performance-energy interaction of these optimizations with the underlying hardware that represents the first generation of Intel Xeon Scalable processors. Considering the MPDATA iterative application as a use case, we explore the fundamentals of energy and performance analysis for a memory-bound application when expos…

citations
Cited by 16 publications
(17 citation statements)
references
References 31 publications
“…As a result, the relatively low operational intensity of each MPDATA kernel [4] is not high enough to efficiently utilize the resources of modern processors. In our works [4], [11], [21], [35], [38], a set of optimizations was developed to exploit resources of multicore ccNUMA/SMP systems more efficiently. The resulting parallelization methodology consists of the following parametric optimization steps: [21] -this step explores spatial blocking across the different kernels, employing overlapped tiling with redundant computations, while all kernels are grouped into five packages using loop fusion.…”
Section: Parallelization Methodology For MPDATA Code On Shared Memory Systems (mentioning)
confidence: 99%
“…Among these systems were 2-socket servers with Intel Xeon CPUs based on Skylake SP, Broadwell, and Haswell architectures. For example, for a platform built with 28-core Intel Platinum 8180 CPUs, the proposed adaptation accelerates the MPDATA application more than 10 times [11] compared to the basic version of code.…”
Section: Parallelization Methodology For MPDATA Code On Shared Memory Systems (mentioning)
confidence: 99%