2020
DOI: 10.1007/978-3-030-44534-8_19

High-Level Synthesis in Implementing and Benchmarking Number Theoretic Transform in Lattice-Based Post-Quantum Cryptography Using Software/Hardware Codesign

Cited by 18 publications (8 citation statements)
References 9 publications
“…Based on the master theorem of computational complexity, the time complexity involved in the above multiplication can be given as 58 ) . For faster polynomial multiplication based on NTT, a new hardware architecture is proposed by Nguyen et al. [96]. The NTT hardware accelerators will be implemented for Kyber and NewHope to improve the overall efficiency. Another hardware accelerator named CARiMoL was introduced by Ishtiaq et al. [97] to provide run-time configurability for multiple security levels in CRYSTALS-Kyber and NewHope schemes.…”
Section: Polynomial Multiplication (mentioning)
confidence: 99%
“…This way we pipeline and partially unroll the forward elimination inner loop (line 11), increasing its throughput at one row element per clock cycle. To take further advantage of the internal parallelization potential of the Gaussian systemizer, we completely unroll and pipeline the backwards substitution loop (lines 19-27). Its two computational loops (lines 19, 24) are merged, thus decreasing the latency.…”
Section: Hardware/software Co-design (mentioning)
confidence: 99%
“…A methodology was proposed in [20] for optimizing NTT loops structure, via loop flattening and trip count reduction to optimize the synthesized code via HLS adding directives with various loop expansion approaches. In [21] an NTT HLS implementation is performed using Vivado 2018.3 on a Zynq UltraScale+ MPSoC and show a penalty of 2% to 5% for latency versus an RTL design and in [22] there is comparison between HLS-ready code using design space exploration based on directives vs. HLS block diagram design. Ozcan and Aysu [2] modularized the NTT algorithm and measured that the most computationally intensive part of it is the Butterfly section, which accounts for 78% of all cycles.…”
Section: Number Theoretic Transform (NTT) A. Definitions (mentioning)
confidence: 99%