2021
DOI: 10.48550/arxiv.2112.14216
Preprint

Casper: Accelerating Stencil Computation using Near-cache Processing

Abstract: Stencil computation is one of the most used kernels in a wide variety of scientific applications, ranging from large-scale weather prediction to solving partial differential equations. Stencil computations are characterized by three unique properties: (1) low arithmetic intensity, (2) limited temporal data reuse, and (3) regular and predictable data access pattern. As a result, stencil computations are typically bandwidth-bound workloads, which only experience limited benefits from the deep cache hierarchy of …

Cited by 5 publications (5 citation statements)
References 29 publications
“…PrIM is opensource and publicly available at [168]. Unlike these prior works, DAMOV is applicable to and can be used to study other PIM architectures than processing-in/-near DRAM, including processing-in/-near cache [68, 93–95, 169–171], processing-in/-near storage [40, 172–181], and processing-in/-near emerging NVMs [81,82,90,91,100,182,183]. This is possible since DAMOV's methodology and benchmarks are mainly concerned with broadly characterizing data movement bottlenecks in an application, independent of the underlying PIM architecture.…”
Section: Discussion (mentioning)
confidence: 99%
“…Naively employing PIM to accelerate data-intensive workloads can lead to sub-optimal performance due to the many design constraints PIM substrates impose (e.g., limited area and power budget available inside 3D-stacked memories [6] or manufacturing limitations of combining memory and logic elements [6,13]). Therefore, many recent works co-design specialized PIM accelerators and algorithms to improve performance and reduce the energy consumption of (i) applications from various application domains, such as graph processing , machine learning [1,, bioinformatics , high-performance computing [95, 101–112], databases [18,19,29,46,60, 113–130], security [131][132][133][134][135...…”
Section: Motivation and Problem (mentioning)
confidence: 99%
“…Though the applications are diverse in scientific computing, analogy to AI, there are several common and performance-critical operations in scientific computing, named Dwarf, defined by the Berkeley View [63]. As one of the seven computational Dwarfs, Stencil is ubiquitously involved in various scientific computing [14], which lies at the heart of thermal diffusion (∼100%), earth system model (>90%), and earthquake prediction model (>90%), etc [34,59,8,40,60,15,12].…”
Section: Introduction (mentioning)
confidence: 99%