The algebraic reconstruction technique (ART) is an iterative algorithm for computed tomography (CT) image reconstruction that delivers better image quality with less radiation dosage than the industry-standard filtered back projection (FBP). However, the high computational cost of ART requires researchers to turn to high-performance computing to accelerate the algorithm. Alas, existing approaches for ART suffer from inefficient design of compressed data structures and computational kernels on GPUs. Thus, this paper presents cuART, our enhanced CUDA-based CT image reconstruction tool based on ART. It delivers a compression and parallelization solution for ART-based image reconstruction on GPUs. We address the under-performance of popular GPU libraries, e.g., cuSPARSE, BRC, and CSR5, on the ART algorithm and propose a symmetry-based CSR format (SCSR) to further compress the CSR data structure and optimize data access for both SpMV and SpMV^T via a column-indices permutation. We also propose sorting-based and sorting-free blocking techniques to optimize the kernel computation by leveraging the sparsity patterns of the system matrix. The end result is that cuART can reduce the memory footprint significantly and enable practical CT datasets to fit into a single GPU. The experimental results on an NVIDIA Tesla K80 GPU illustrate that our approach can achieve up to 6.8x, 7.2x, and 5.4x speedups over counterparts that use cuSPARSE, BRC, and CSR5, respectively.
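For orientation, the two sparse operations named above (SpMV and SpMV^T) and a basic ART update can be sketched on a plain CSR matrix. This is a minimal CPU sketch under stated assumptions: it uses the classical Kaczmarz row-action form of ART and standard CSR, not the SCSR format, the blocking techniques, or the GPU kernels of cuART. All function names (`spmv`, `spmv_t`, `art_sweep`) and the `relax` parameter are hypothetical and chosen only for illustration.

```python
# Assumption: standard CSR storage (row_ptr, col_idx, vals) for an m x n matrix A.
# This is NOT the paper's SCSR format; it only illustrates the operations involved.

def spmv(row_ptr, col_idx, vals, x):
    """Forward projection y = A x: each row gathers from x."""
    m = len(row_ptr) - 1
    y = [0.0] * m
    for i in range(m):
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += vals[k] * x[col_idx[k]]
        y[i] = s
    return y

def spmv_t(row_ptr, col_idx, vals, r, n_cols):
    """Back projection x = A^T r: each row scatters into x.
    On CSR this access pattern is irregular, which is why the
    transposed product is often the harder one to optimize."""
    x = [0.0] * n_cols
    for i in range(len(row_ptr) - 1):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            x[col_idx[k]] += vals[k] * r[i]
    return x

def art_sweep(row_ptr, col_idx, vals, b, x, relax=1.0):
    """One in-place Kaczmarz (row-action ART) sweep over all rows:
    x <- x + relax * (b_i - a_i . x) / ||a_i||^2 * a_i"""
    for i in range(len(row_ptr) - 1):
        lo, hi = row_ptr[i], row_ptr[i + 1]
        norm_sq = sum(vals[k] * vals[k] for k in range(lo, hi))
        if norm_sq == 0.0:
            continue  # skip empty rows (rays that hit no pixel)
        residual = b[i] - sum(vals[k] * x[col_idx[k]] for k in range(lo, hi))
        step = relax * residual / norm_sq
        for k in range(lo, hi):
            x[col_idx[k]] += step * vals[k]
    return x

# Toy 3 x 3 system matrix [[2, 0, 1], [0, 3, 0], [0, 0, 4]] in CSR form.
ROW_PTR = [0, 2, 3, 4]
COL_IDX = [0, 2, 1, 2]
VALS = [2.0, 1.0, 3.0, 4.0]
```

Given measurements `b = spmv(ROW_PTR, COL_IDX, VALS, x_true)`, repeated calls to `art_sweep` starting from a zero image move `x` toward `x_true` for a consistent system. The sketch also shows why the storage format matters: every sweep streams through `col_idx` and `vals`, so compressing them reduces both memory footprint and memory traffic.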