2016 28th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)
DOI: 10.1109/sbac-pad.2016.18

Speeding Up Stencil Computations with Kernel Convolution

Cited by 1 publication (3 citation statements, all of type "mentioning"; citing publication year: 2019). References 1 publication.

“…This example also highlights how the DCMI strategy differs from stencil computing strategies for CPUs and GPUs. One example is ASLI [18], which is similar to DCMI as it creates a new stencil operator that covers multiple time-steps by convolving the operator with itself. Although this approach enables data reuse within a cone, it suffers from the same redundant computation issue as CA, because it does not enable reuse between cones (see Figure 2).…”
Section: Why Does DCMI Use Minimal OCM? (mentioning; confidence: 99%)
“…A number of approaches that optimize ISLs for CPUs and GPUs combine computations from different loop levels to reduce the amount of redundant computation. ASLI [18] is an application-level technique that creates a new stencil operator that covers multiple time-steps by convolving the original stencil operator with itself two or more times. The compiler optimizations loop unrolling (e.g., References [22,45,61]) and forward substitution (e.g., Reference [24]) can be used to achieve similar gains.…”
Section: CPUs, GPUs, and ASIC Accelerators (mentioning; confidence: 99%)
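To make the self-convolution idea in these statements concrete, the following is a minimal 1D NumPy sketch, not code from the ASLI paper (which targets real multidimensional stencils); all function and variable names are illustrative. It shows that convolving a 3-point stencil kernel with itself yields a 5-point operator, and that one sweep of that wider operator matches two ordinary sweeps everywhere except at the boundaries.

    # Minimal sketch: applying a stencil kernel twice equals one application
    # of the kernel convolved with itself (associativity of convolution).
    import numpy as np

    def self_convolve(kernel, times=2):
        # Convolve the kernel with itself (times - 1) further times;
        # each convolution widens the operator by (len(kernel) - 1) points.
        result = kernel
        for _ in range(times - 1):
            result = np.convolve(result, kernel)
        return result

    def apply_stencil(grid, kernel):
        # One sweep over a 1D grid with implicit zero-padded boundaries.
        return np.convolve(grid, kernel, mode="same")

    k = np.array([0.25, 0.5, 0.25])             # 3-point averaging stencil
    grid = np.random.default_rng(0).random(16)

    two_sweeps = apply_stencil(apply_stencil(grid, k), k)
    k2 = self_convolve(k, times=2)              # 5-point, two-time-step operator
    one_sweep = apply_stencil(grid, k2)

    # Interior points agree; boundary points differ because of the zero
    # padding, so a tiled implementation must recompute them per tile.
    assert np.allclose(two_sweeps[2:-2], one_sweep[2:-2])

The single sweep with the self-convolved operator avoids writing and re-reading the intermediate grid, which is the data-reuse benefit the statements attribute to ASLI; the boundary mismatch in the sketch is a 1D analogue of the redundant computation between cones that the citing paper identifies as the remaining cost.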