GPU-accelerated SART reconstruction using the CUDA programming environment

Keck, Benjamin; Hofmann, Hannes; Scherl, Holger; Kowarschik, Markus; Hornegger, Joachim

doi:10.1117/12.811559

Cited by 38 publications

(38 citation statements)

References 16 publications

(9 reference statements)


“…The inputs were 364 1024×768 projections on a half-circle trajectory. We reconstructed a 3D volume within the FOV at two different resolutions, $256^3$ and $512^3$. We measured the running times of the single projection method (S), the multiple projection method (M), and the hybrid ordering method (H).…”

Section: Results - mentioning

confidence: 99%
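For a sense of scale, the problem sizes quoted above can be tallied in a few lines. This is only a back-of-the-envelope sketch: the 4-byte (float32) sample size is an assumption that the quote does not state, and the helper names are hypothetical.

```python
# Rough data sizes for the quoted setup (364 projections of 1024x768,
# volumes of 256^3 and 512^3).
# Assumption (not in the source): each sample is stored as a 4-byte float.
BYTES_PER_SAMPLE = 4


def projection_samples(count, width, height):
    """Total number of detector pixels across all projections."""
    return count * width * height


def volume_voxels(side):
    """Number of voxels in a cubic volume with the given side length."""
    return side ** 3


def to_mib(samples):
    """Storage in MiB under the 4-byte-per-sample assumption."""
    return samples * BYTES_PER_SAMPLE / 2**20


proj = projection_samples(364, 1024, 768)
print(f"projections: {proj} pixels, {to_mib(proj):.0f} MiB")
for side in (256, 512):
    vox = volume_voxels(side)
    print(f"{side}^3 volume: {vox} voxels, {to_mib(vox):.0f} MiB")
```

Under that assumption the projection set is larger than either volume, which is the regime the citing paper singles out (low-resolution volumes with high-resolution projections).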

“…al. [3] used this method in CUDA-based SART. This method is better when cache-miss penalty is larger than memory writing overhead.…”

Section: A Single Projection Methods For Each Projection For Each (Sl… - mentioning

confidence: 99%

“…2) are mapped to thread blocks in CUDA. This volume-to-slabs decomposition has been used in [2][3][4]. Another type of mapping is based on decomposing the volume into horizontal tiles and each tile is mapped to a CUDA thread block, as used in [5], [6].…”

Section: Cone-beam Back-projection - mentioning

confidence: 99%
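The two mappings named in the statement above can be illustrated with a small host-side sketch. It is written in plain Python rather than CUDA and only enumerates the index ranges that each thread block would own. The slab axis, slab thickness, tile size, and all function names are assumptions made for illustration; the quote does not specify them.

```python
def split_range(n, size):
    """Split range(n) into consecutive (start, stop) chunks of at most `size`."""
    return [(start, min(start + size, n)) for start in range(0, n, size)]


def slab_blocks(n_slices, slab_thickness):
    """Volume-to-slabs: one work item (thread block) per contiguous slab.

    Assumption: a slab is a contiguous range of slices along one axis.
    """
    return split_range(n_slices, slab_thickness)


def tile_blocks(nx, ny, tile):
    """Horizontal tiles: one work item (thread block) per (x-range, y-range)."""
    return [(xr, yr)
            for yr in split_range(ny, tile)
            for xr in split_range(nx, tile)]


slabs = slab_blocks(256, 16)
tiles = tile_blocks(256, 256, 16)
print(len(slabs), "slabs, first:", slabs[0])
print(len(tiles), "tiles, first:", tiles[0])
```

In a real kernel the list index would correspond to a block index, and the threads within a block would cover the voxels inside its slab or tile.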

“…2) to ensure better cache hit-rate. Back-projection implementations [1][2][3][4][5][6] all use texture memory for projection data storage to deal with irregular memory fetching pattern. Using texture memory has several advantages.…”

Section: Cone-beam Back-projection - mentioning

confidence: 99%
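The quote does not list the advantages it has in mind. One widely used texture feature in back-projection is hardware bilinear interpolation at fractional coordinates with clamped addressing, and the sketch below emulates that behavior on the CPU. It is an illustration only: the function names are hypothetical, and the half-texel coordinate offset and reduced-precision weights of real texture hardware are deliberately ignored.

```python
import math


def fetch_clamped(image, x, y):
    """Read image[y][x], clamping out-of-range indices to the border."""
    height, width = len(image), len(image[0])
    x = min(max(x, 0), width - 1)
    y = min(max(y, 0), height - 1)
    return image[y][x]


def sample_bilinear(image, u, v):
    """Bilinearly interpolate `image` at the fractional position (u, v)."""
    x0, y0 = math.floor(u), math.floor(v)
    fx, fy = u - x0, v - y0
    # Interpolate along x on the two neighboring rows, then along y.
    top = (1 - fx) * fetch_clamped(image, x0, y0) \
        + fx * fetch_clamped(image, x0 + 1, y0)
    bottom = (1 - fx) * fetch_clamped(image, x0, y0 + 1) \
        + fx * fetch_clamped(image, x0 + 1, y0 + 1)
    return (1 - fy) * top + fy * bottom


projection = [[0.0, 1.0],
              [2.0, 3.0]]
print(sample_bilinear(projection, 0.5, 0.5))
```

In a voxel-driven back-projector, (u, v) would be the detector position a voxel projects to, which is generally fractional and varies irregularly between neighboring threads.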


Cache-aware GPU memory scheduling scheme for CT back-projection

Zheng, Mueller (2010)

IEEE Nuclear Science Symposium & Medical Imaging Conference


Abstract: Graphics processing units (GPUs) are well suited to compute-intensive tasks and are among the fastest solutions for Computed Tomography (CT) reconstruction. As previous research shows, the bottleneck of a GPU implementation is not the computational power but the memory bandwidth. We propose a cache-aware memory-scheduling scheme for the back-projection, which can ensure better load balancing between the GPU processors and the GPU memory. The proposed reshuffling method can be applied directly to existing GPU-accelerated CT reconstruction pipelines. The experimental results show that our optimization can achieve speedups ranging from 1.18 to 1.48. Our cache-optimization method is particularly effective for low-resolution volumes with high-resolution projections.


“…Equation 1 is solved using Algorithm 1 by alternately minimizing the data consistency term $\|Ax - p\|_2$ and the regularization term $R(x)$ for a fixed number of iterations $N_{ART}$. In step 3 of Algorithm 1 data consistency is enforced by applying three iterations of the GPU-based Ordered Subsets-ART (OS-ART) method presented in [9]. In step 4 prior knowledge about the reconstructed volume is incorporated by applying operator T to the current volume estimation to reduce the penalty term $R(x)$.…”

Section: End For - mentioning

confidence: 99%
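The alternating structure described in this statement can be sketched on a toy problem. This is not the authors' Algorithm 1: plain sequential ART (Kaczmarz) sweeps stand in for the GPU-based OS-ART of [9], and a non-negativity clamp stands in for the operator T, whose actual definition the quote does not give. The 2×2 system and all function names are hypothetical.

```python
def art_sweep(A, p, x, relax=1.0):
    """One ART (Kaczmarz) sweep: project x onto each row equation in turn."""
    for row, p_i in zip(A, p):
        norm_sq = sum(a * a for a in row)
        if norm_sq == 0.0:
            continue  # an all-zero row carries no constraint
        residual = p_i - sum(a * x_j for a, x_j in zip(row, x))
        step = relax * residual / norm_sq
        x = [x_j + step * a for a, x_j in zip(row, x)]
    return x


def clamp_nonnegative(x):
    """Placeholder for operator T (assumed here): enforce x >= 0."""
    return [max(0.0, x_j) for x_j in x]


def alternate(A, p, x0, n_outer, inner_sweeps=3, T=clamp_nonnegative):
    """Alternate data-consistency sweeps with a prior-knowledge step."""
    x = list(x0)
    for _ in range(n_outer):
        for _ in range(inner_sweeps):  # data consistency (three sweeps)
            x = art_sweep(A, p, x)
        x = T(x)                       # apply the prior to the estimate
    return x


# Toy system with exact solution x = [2, 1].
A = [[2.0, 1.0], [1.0, 3.0]]
p = [5.0, 5.0]
x = alternate(A, p, [0.0, 0.0], n_outer=5)
print(x)
```

Passing a different callable as `T` swaps the prior without touching the data-consistency loop, which mirrors the separation of steps 3 and 4 in the quoted description.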

Iterative denoising algorithms for perfusion C-arm CT with a rapid scanning protocol

Manhart, Fieselmann, et al. (2013)

2013 IEEE 10th International Symposium on Biomedical Imaging

Self Cite


Tissue perfusion measurement using C-arm angiography systems capable of CT-like imaging (C-arm CT) is a novel technique with potentially high benefit for catheter-guided treatment of stroke in the interventional suite. New rapid scanning protocols with increased C-arm rotation speed enable fast acquisition of C-arm CT volumes and allow the contrast flow to be sampled with improved temporal resolution. However, the peak contrast attenuation values of brain tissue typically lie in a range of 5-30 HU, so perfusion imaging is very sensitive to noise. In this work we compare different denoising algorithms based on the algebraic reconstruction technique (ART) and introduce a novel denoising technique that requires only iterative filtering in volume space and is computationally much more attractive. Our evaluation using a realistic digital brain phantom shows that all methods improve the perfusion maps perceptibly compared to Feldkamp-type (FDK) reconstruction. The volume-based technique performs similarly to the ART-based methods: the Pearson correlation of reference and reconstructed blood flow maps increases from 0.61 for the FDK method to 0.81 for the best ART method and to 0.79 for the volume-based method. Furthermore, results from a canine stroke model study are shown.
