2017
DOI: 10.1007/978-3-319-58667-0_11

gearshifft – The FFT Benchmark Suite for Heterogeneous Platforms

Abstract: Fast Fourier Transforms (FFTs) are exploited in a wide variety of fields ranging from computer science to natural sciences and engineering. With the rising data production bandwidths of modern FFT applications, judging best which algorithmic tool to apply, can be vital to any scientific endeavor. As tailored FFT implementations exist for an ever increasing variety of high performance computer hardware, choosing the best performing FFT implementation has strong implications for future hardware purchase decision…

Citations: Cited by 15 publications (10 citation statements)
References: References 24 publications
“…Figure 2 demonstrates that for small FFTs, Kabuki is less performant than either Hazelhen or Shaheen II, but for FFTs above a size of 512 points, Kabuki is significantly more performant than either Hazelhen or Shaheen II. Vector computers and graphics processing units typically have higher memory bandwidth but also higher latency than typical CPUS found, thus similar results for one dimensional FFT performance are also reported in [30], where for small FFTs, CPU performance is best, but for large FFTs, GPU performance is better. There are many parallel scientific computing programs that have modeling assumptions built into them (for example in computational fluid mechanics, materials science and chemistry).…”
Section: Lessons Learned
Citation type: supporting
confidence: 68%
“…Repeating experiments multiple times has been suggested as a means of verifying reproducibility in benchmarking [16]. Such a methodology has been implemented in gearshifft, a heterogeneous fast Fourier transform benchmark suite [30,36]. In most cases experiments were repeated several times, usually successively, with minor differences between results.…”
Section: Lessons Learned
Citation type: mentioning
confidence: 99%
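
The repeat-and-compare methodology mentioned in the statement above can be illustrated with a minimal sketch. It is not gearshifft's implementation: the helper names `fft_radix2`, `time_repeated`, and `summarize` are hypothetical, the pure-Python radix-2 FFT is only a stand-in workload, and the choice of five repeats with one excluded warm-up run is an assumption made for illustration.

```python
import cmath
import statistics
import time


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def time_repeated(n, repeats=5):
    """Time the same FFT `repeats` times and return per-run seconds."""
    signal = [complex(i % 7, 0.0) for i in range(n)]
    fft_radix2(signal)  # warm-up run, excluded from the timings
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fft_radix2(signal)
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Report the spread across runs so outliers are visible."""
    return {
        "runs": len(timings),
        "min_s": min(timings),
        "median_s": statistics.median(timings),
        "max_s": max(timings),
    }


if __name__ == "__main__":
    for size in (64, 512, 4096):
        print(size, summarize(time_repeated(size)))
```

Reporting the minimum, median, and maximum together, rather than a single number, makes it easier to see whether successive runs differ only slightly, as the quoted statement describes.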
“…Fast FFTs on GPUs with CUDA and OpenCL: FFTs on GPUs typically provide up to an order of magnitude advantage over FFTW [35], particularly if high-end NVIDIA GPUs are used. The pyculib library [36] (formerly Anaconda Accelerate [37]) provides a python wrapper around the NVIDIA cuFFT Library [38], allowing parallel computation of FFTs on a GPU.…”
Section: Accelerating the Discrete Fast Fourier Transform: Gpus And/o
Citation type: mentioning
confidence: 99%
“…The HPC Challenge benchmark suite [32] which is developed by the University of Tennessee is one of the well-known HPC benchmark suites and is used in many research works [33][34][35]. This suite is composed of several benchmarks, each of which focuses on a particular feature of the HPC clusters such as the ability to do floating-point calculations, the communication speed between nodes, and the potentials of running demanding algorithms such as DFT.…”
Section: Hpc Benchmarks and Results
Citation type: mentioning
confidence: 99%