2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2021
DOI: 10.1109/ipdps49936.2021.00116

High-Performance Spectral Element Methods on Field-Programmable Gate Arrays: Implementation, Evaluation, and Future Projection

citations
Cited by 12 publications
(12 citation statements)
references
References 35 publications
“…With regards to total runtime, the computation with rematerialization performs around 2x faster than the original version and requires around 33% fewer cycles to make the computation for inputs larger than 2048 elements. Of note is that we are able to saturate the available memory bandwidth for double-precision much earlier than for single-precision, something that was also noted in our previous work and shown in the appendix to [25].…”
Section: With Rematerialization
Citation type: supporting
confidence: 78%
“…When designing a custom accelerator, such as the SEM solver accelerator we incept in this work, it is crucial to first derive a performance model for said computation. An analytical performance model can help designing accelerator in multiple ways: (i) we can easier understand the bottlenecks of the application, (ii) we can derive the theoretically observable peak performance and also use it to model future (today non-existing) architecture (e.g., [25,50], and (iii)…”
Section: Theoretical Performance Analysis
Citation type: mentioning
confidence: 99%