2021
DOI: 10.1364/oe.422266

Out-of-core GPU 2D-shift-FFT algorithm for ultra-high-resolution hologram generation

Abstract: We propose a novel out-of-core GPU algorithm for 2D-Shift-FFT (i.e., 2D-FFT with FFT-shift) to generate ultra-high-resolution holograms. Generating an ultra-high-resolution hologram requires a large complex matrix (e.g., 100K²) with a size that typically exceeds GPU memory. To handle such a large-scale hologram plane with limited GPU memory, we employ a 1D-FFT based 2D-FFT computation method. We transpose the column data to have a continuous memory layout to improve the column-wise 1D-FFT stage performance in …
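The 1D-FFT based 2D-FFT with a transposition step, as described in the abstract, can be illustrated on the CPU with NumPy. This is a minimal sketch, not the paper's GPU implementation: the function names `fft2_by_rows` and `shift_fft2` and the `block_rows` parameter are hypothetical, the row blocks only stand in for chunks that would fit in GPU memory, and the shift convention (shift, transform, shift) is an assumption.

```python
import numpy as np

def fft2_by_rows(x, block_rows=2):
    """2D FFT built from 1D FFTs: row pass, transpose, row pass, transpose back."""
    def row_pass(a):
        out = np.empty(a.shape, dtype=np.complex128)
        # Transform a few rows at a time; each block stands in for a chunk
        # small enough to fit in GPU memory (assumption for illustration).
        for start in range(0, a.shape[0], block_rows):
            stop = start + block_rows
            out[start:stop] = np.fft.fft(a[start:stop], axis=1)
        return out

    stage1 = row_pass(x)                        # 1D FFT along every row
    stage1_t = np.ascontiguousarray(stage1.T)   # columns become contiguous rows
    stage2 = row_pass(stage1_t)                 # column FFTs, run as row FFTs
    return np.ascontiguousarray(stage2.T)       # restore the original orientation

def shift_fft2(x, block_rows=2):
    """Centered 2D FFT: shift, transform, shift (assumed convention)."""
    return np.fft.fftshift(fft2_by_rows(np.fft.ifftshift(x), block_rows))

rng = np.random.default_rng(0)
field = rng.standard_normal((6, 8)) + 1j * rng.standard_normal((6, 8))
reference = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(field)))
max_error = np.max(np.abs(shift_fft2(field) - reference))
```

Here `max_error` compares the blocked result with NumPy's direct `fft2`. The explicit transpose is what makes the second pass read contiguous memory, which is the layout benefit the abstract refers to.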

Cited by 11 publications (4 citation statements)
References 23 publications
“…However, only part of the system memory can be a page-locked region, which limits the problem size that previous algorithms could treat. Lee and others [25] proposed a pinned-memory buffer approach to handle large-scale data that exceeded the maximum size of the page-locked memory. Instead of managing the entire input data in the pinned-memory space, they constructed small-sized pinned-memory buffers.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
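The pinned-buffer idea in the statement above, streaming large data through small reusable buffers instead of page-locking all of it, can be sketched as follows. This is only an illustration under stated assumptions: the function `process_through_buffer` and its parameters are hypothetical, an ordinary NumPy array stands in for a page-locked buffer (a real one would come from a CUDA allocation such as `cudaHostAlloc`), and the `kernel` callable stands in for the copy to the GPU, the computation, and the copy back.

```python
import numpy as np

def process_through_buffer(data, buffer_elems, kernel):
    """Stream a large 1D array through one small, reusable staging buffer."""
    # Stand-in for a small pinned (page-locked) buffer; plain host memory here.
    staging = np.empty(buffer_elems, dtype=data.dtype)
    result = np.empty_like(data)
    for start in range(0, data.size, buffer_elems):
        stop = min(start + buffer_elems, data.size)
        n = stop - start
        staging[:n] = data[start:stop]             # pageable memory -> staging buffer
        result[start:stop] = kernel(staging[:n])   # stand-in for GPU transfer and compute
    return result

values = np.arange(10, dtype=np.float64)
doubled = process_through_buffer(values, buffer_elems=4, kernel=lambda chunk: chunk * 2)
```

The buffer size is fixed regardless of the input size, so the amount of page-locked memory needed no longer grows with the problem.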
“…We employed the pinned buffer concept of Lee and others [25] and adapted it to the 3D-FFT problem. We also present a novel data rearrangement (transposition) method for the 3D-FFT case to efficiently utilize the pinned buffer.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Out-of-core computations on GPUs. The limited memory of GPUs has motivated many studies on how to efficiently access data stored outside the GPU memory for several specific applications such as stencil computations [3], sorting problems [4], large-scale graph processing [5] or graphic computations [6]. The solution generally consists of building data blocks that each fit within the GPU memory.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Because high-resolution CGH synthesis requires a large amount of computation, speeding it up is essential, and many studies have been conducted. Lee [13] accelerated the FFT by processing the FFT-shift and the FFT simultaneously, with the FFT-shift performed before and after the FFT. Matsushima [12] proposed an algorithm to speed up occlusion culling by limiting the area calculated by diffracted angles in mesh-based CGH.…”
Section: Introduction
Citation type: mentioning
confidence: 99%