Proceedings of the Tenth International Symposium on Code Generation and Optimization 2012
DOI: 10.1145/2259016.2259044

Hierarchical overlapped tiling

Abstract: This paper introduces hierarchical overlapped tiling, a transformation that applies loop tiling and fusion to conventional loops. Overlapped tiling is a useful transformation to reduce communication overhead, but it may also generate a significant amount of redundant computation. Hierarchical overlapped tiling performs overlapped tiling hierarchically to balance communication overhead and redundant computation, and thus has the potential to provide better performance. In this paper, we describe the hierarchical…
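
The trade-off the abstract describes can be illustrated with a small sketch. This is not the paper's hierarchical algorithm; it is plain single-level overlapped tiling on a 1D three-point averaging stencil, chosen here only for illustration. The function names (`step`, `reference`, `overlapped_tiled`), the stencil, and the fixed boundary values are all assumptions. Each tile loads a halo of `steps` extra points per side, then runs all time steps without exchanging data with its neighbours, at the cost of recomputing points inside the halo.

```python
def step(u):
    """One sweep of a 3-point average; returns the len(u) - 2 interior points."""
    return [(u[i - 1] + u[i] + u[i + 1]) / 3.0 for i in range(1, len(u) - 1)]


def reference(u, steps):
    """Untiled baseline: `steps` sweeps with the two end values held fixed."""
    for _ in range(steps):
        u = [u[0]] + step(u) + [u[-1]]
    return u


def overlapped_tiled(u, steps, tile):
    """Overlapped tiling sketch (assumes len(u) >= 2).

    Returns (result, work), where `work` counts stencil updates performed,
    including the redundant ones inside the halos.
    """
    n = len(u)
    out = list(u)
    work = 0
    for lo in range(0, n, tile):
        hi = min(lo + tile, n)
        # Extend the tile by `steps` points per side, clipped to the domain.
        a, b = max(0, lo - steps), min(n, hi + steps)
        local = u[a:b]
        for _ in range(steps):
            inner = step(local)
            work += len(inner)
            # At a domain edge the fixed boundary value is kept; elsewhere
            # the valid region shrinks by one point per sweep.
            left = [local[0]] if a == 0 else []
            right = [local[-1]] if b == n else []
            if a != 0:
                a += 1
            if b != n:
                b -= 1
            local = left + inner + right
        # After all sweeps, `local` covers [a, b), which contains [lo, hi).
        out[lo:hi] = local[lo - a:hi - a]
    return out, work


if __name__ == "__main__":
    u = [float((i * 7) % 11) for i in range(64)]
    ref = reference(u, 4)
    out, work = overlapped_tiled(u, 4, 16)
    print("results match:", all(abs(x - y) < 1e-12 for x, y in zip(ref, out)))
    print("stencil updates, tiled vs untiled:", work, 4 * (len(u) - 2))
```

In this sketch the tiles are fully independent, so no communication happens between time steps, but the tiled version performs more stencil updates than the untiled baseline. Smaller tiles or more fused time steps increase that redundancy, which is the cost the abstract says a hierarchical scheme aims to balance.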

Citations: cited by 64 publications (33 citation statements)
References: 35 publications (38 reference statements)

Citation statements
“…[4] describes the implementation of a page coloring framework in the Linux kernel. Apart from cache partitioning, researchers tried to increase the shared cache utilization by employing compiler transformations and most commonly loop tiling transformations [6] [7] [5] [13] [14]. [6] describes a method for automatically generating multilevel tiled code for any polyhedral iteration space.…”
Section: Related Work (mentioning)
confidence: 99%
“…techniques like cache oblivious, time skewing, wavefront or overlapped tiling [1][2][3][4][5][6][7][8][9][10][11][12]38]. In addition, domain-specific compilers have recently been developed for parallel code generation from a stylized stencil specification [39][40][41] or from a code excerpt [42].…”
Section: Related Work (mentioning)
confidence: 99%
“…Most prior work on optimizing stencil computations focus on lower-order methods (typically p = 2) where there is limited reuse of data and computations are notoriously memory bound. Much of this prior work reduces the amount of data movement by fusing multiple stencil sweeps through techniques like cache oblivious, time skewing, wavefront or overlapped tiling [1][2][3][4][5][6][7][8][9][10][11][12].…”
Section: Introduction (mentioning)
confidence: 99%
“…(In the iterated stencil computation literature, the redundant regions are often called "ghost zones," and this strategy is sometimes called "overlapped tiling" [17,31].) On a modern x86, this strategy is 10× faster than the breadth-first strategy using the same amount of multithreaded and vector parallelism.…”
Section: Motivation: Scheduling a Two-stage Pipeline (mentioning)
confidence: 99%