Proceedings of the 29th ACM on International Conference on Supercomputing 2015
DOI: 10.1145/2751205.2751245

Automatic Energy Efficient Parallelization of Uniform Dependence Computations

Abstract: Energy is now a critical concern in all aspects of computing. We address a class of programs that includes the so-called "stencil computations" that have already been optimized for speed. We target the energy expended in dynamic memory accesses, since most other components of the total energy are usually already reduced when optimizing for speed alone. For a standard shared memory multi-core processor, we seek to minimize the total number of off-chip memory accesses without sacrificing execution time. Our stra…

Cited by 3 publications (4 citation statements)
References 31 publications (54 reference statements)
“…We also provide variability of both speed and cache behavior with respect to tile size. Some prior work on code generators for various types of tiling have empirically compared iteration space tiling and cache-oblivious methods [2,11,44,58,73]. The objective of these experiments are slightly different from ours; the main target of evaluation is the code generation tools.…”
Section: Empirical Study Of Cache-oblivious Methods (mentioning)
confidence: 99%
“…Lastly, our runtime scheduling policy mandates that each processor must follow a strictly lexicographically ascending order within the block and finish all of the work within a block before being preempted. Such policy essentially guarantees a multi-pass execution of the iteration space, which was previously proven [57] to exhibit energy efficiency but was only applicable to stencil kernels.…”
Section: Experimental Evaluation (mentioning)
confidence: 99%
“…As many authors note, such static control structures have a number of drawbacks [3,7,8,16,30,57]. First, they induce unnecessary synchronization-any tile of wavefront w must wait for all tiles of wavefront w − 1.…”
Section: Introduction (mentioning)
confidence: 99%