2020 IEEE International Symposium on Workload Characterization (IISWC) 2020
DOI: 10.1109/iiswc50251.2020.00033

Accelerating Number Theoretic Transformations for Bootstrappable Homomorphic Encryption on GPUs

Abstract: Homomorphic encryption (HE) draws huge attention as it provides a way of privacy-preserving computations on encrypted messages. Number Theoretic Transform (NTT), a specialized form of Discrete Fourier Transform (DFT) in the finite field of integers, is the key algorithm that enables fast computation on encrypted ciphertexts in HE. Prior works have accelerated NTT and its inverse transformation on a popular parallel processing platform, GPU, by leveraging DFT optimization techniques. However, these GPU-based st…
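To make the abstract's description of NTT as a DFT over a finite field concrete, here is a minimal sketch in Python. It is not the paper's GPU implementation: the modulus `p = 17`, the root `omega = 2`, the length 8, and all function names are toy assumptions chosen for illustration, and the transform is the naive quadratic-time form. HE schemes typically use a negacyclic variant with much larger parameters; this sketch shows only the plain cyclic case.

```python
def ntt(a, omega, p):
    """Naive O(n^2) NTT: a DFT in which omega, an n-th root of unity mod p,
    replaces the complex root of unity."""
    n = len(a)
    return [sum(a[j] * pow(omega, j * k, p) for j in range(n)) % p
            for k in range(n)]


def intt(values, omega, p):
    """Inverse NTT: transform with omega^-1, then scale by n^-1 (mod p)."""
    n = len(values)
    inv_n = pow(n, -1, p)          # modular inverse, Python 3.8+
    inv_omega = pow(omega, -1, p)
    return [(inv_n * x) % p for x in ntt(values, inv_omega, p)]


def cyclic_convolution(a, b, p):
    """Reference schoolbook cyclic convolution mod p, used only for checking."""
    n = len(a)
    out = [0] * n
    for i in range(n):
        for j in range(n):
            out[(i + j) % n] = (out[(i + j) % n] + a[i] * b[j]) % p
    return out


# Toy parameters (assumptions): 2 has multiplicative order 8 modulo 17.
P, OMEGA = 17, 2
a = [1, 2, 3, 4, 0, 0, 0, 0]
b = [5, 6, 7, 0, 0, 0, 0, 0]

# Multiply pointwise in the transformed domain, then transform back.
pointwise = [(x * y) % P for x, y in zip(ntt(a, OMEGA, P), ntt(b, OMEGA, P))]
product = intt(pointwise, OMEGA, P)
```

The pointwise product in the transformed domain matches the schoolbook convolution, which is the property that lets NTT speed up polynomial multiplication on ciphertexts.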

Citations: cited by 44 publications (30 citation statements)
References: 28 publications
“…These show why the use of AVX-512 achieves an insufficient degree of performance improvement and also imply that we can substantially accelerates HE Mul by natively supporting these SIMD instructions. Previous work [49], [53] also showed that SIMD is effective in accelerating NTT and iNTT on CPUs and GPUs. Impact of Q on the characteristics of HE Mul: Q determines multiplicative depth L; a larger depth requires a bigger Q.…”
Section: Discussion and Related Work (mentioning)
confidence: 94%
“…For NTT and iNTT, [49] characterizes various NTT implementations, including the high-radix approach in this paper, and suggests on-the-fly twiddle factor generation. Another approach [10] is to exploit Discrete Galois Transform (DGT)…”
Section: Discussion and Related Work (mentioning)
confidence: 99%
“…For the CPU implementation of NTT, we use the same approach as in the RNS operation, where each thread takes N residues (i.e., a^(i) for a given i) at a time, and perform N-point NTT. For the GPU implementation, we use the hierarchical NTT implementation [KJPA20], which heavily exploits shared memory in GPUs while adopting an earlier approach in [GLD+08]. Specifically, for every (i)NTT with N residues, we use 8 per-thread (i)NTT kernels, as described in [KJPA20], where each thread in a kernel loads eight residues into the registers at a time.…”
Section: Basic HE Operations (mentioning)
confidence: 99%
“…For the GPU implementation, we use the hierarchical NTT implementation [KJPA20], which heavily exploits shared memory in GPUs while adopting an earlier approach in [GLD+08]. Specifically, for every (i)NTT with N residues, we use 8 per-thread (i)NTT kernels, as described in [KJPA20], where each thread in a kernel loads eight residues into the registers at a time. We launch kernels each performing radix-256 or radix-512 (i)NTT, where radix-k divides an N-point transformation into k interleaved N/k-point transformations.…”
Section: Basic HE Operations (mentioning)
confidence: 99%
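The last citation statement describes radix-k as dividing an N-point transformation into k interleaved N/k-point transformations. The following is a rough Python sketch of that decomposition only, not the cited GPU kernels: the modulus 17, the root 2, the length 8, and the function names are toy assumptions, and the sub-transforms are computed naively rather than in shared memory or registers.

```python
def naive_ntt(a, omega, p):
    """Reference O(n^2) NTT: X[m] = sum_j a[j] * omega^(j*m) mod p."""
    n = len(a)
    return [sum(a[j] * pow(omega, j * m, p) for j in range(n)) % p
            for m in range(n)]


def radix_k_ntt(a, omega, p, k):
    """Split an N-point NTT into k interleaved (N/k)-point NTTs.

    Sub-sequence r holds a[r], a[r+k], a[r+2k], ... and is transformed
    with the root omega^k. The outputs are recombined as
        X[m] = sum_r omega^(r*m) * Y_r[m mod (N/k)]   (mod p).
    """
    n = len(a)
    if n % k != 0:
        raise ValueError("k must divide the transform length")
    sub_n = n // k
    sub_omega = pow(omega, k, p)

    # k independent sub-transforms over the interleaved sub-sequences.
    subs = [naive_ntt(a[r::k], sub_omega, p) for r in range(k)]

    # Recombine with twiddle factors omega^(r*m).
    out = []
    for m in range(n):
        acc = 0
        for r in range(k):
            acc += pow(omega, r * m, p) * subs[r][m % sub_n]
        out.append(acc % p)
    return out


# Toy parameters (assumptions): 2 is a primitive 8th root of unity mod 17.
P, OMEGA = 17, 2
data = [3, 1, 4, 1, 5, 9, 2, 6]
reference = naive_ntt(data, OMEGA, P)
split_results = {k: radix_k_ntt(data, OMEGA, P, k) for k in (2, 4, 8)}
```

Each of the k sub-transforms is independent of the others, which is what makes the decomposition attractive for parallel hardware; how the work is mapped to threads and memory is specific to the cited implementations and is not modeled here.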