CRYSTALS-Dilithium is a lattice-based signature scheme to be standardized by NIST as the primary post-quantum signature algorithm. In this work, we thoroughly study how to optimize implementations of Dilithium using the Advanced Vector Extensions (AVX) instruction sets, specifically AVX2 and the more recent AVX-512. We first present an improved parallel small polynomial multiplication with tailored early evaluation (PSPM-TEE) to further speed up the signing procedure, which yields a speedup of 5%-6% over the original PSPM Dilithium implementation. We then present a tailored reduction method that is simpler and faster than Montgomery reduction. Our optimized AVX2 implementation achieves a speedup of 3%-8% over the state-of-the-art Dilithium AVX2 software. Finally, for the first time, we propose a fully and highly vectorized implementation of Dilithium using AVX-512. This is achieved by carefully vectorizing most of Dilithium's functions with AVX-512 instructions to improve efficiency in both time and space. With all these optimizations, our AVX-512 implementation improves performance by 37.3%/50.7%/39.7% in key generation, 34.1%/37.1%/42.7% in signing, and 38.1%/38.7%/40.7% in verification for the parameter sets Dilithium2/3/5, respectively. To the best of our knowledge, our AVX-512 implementation has the best performance for Dilithium on the Intel x64 CPU platform to date.
In this work, we systematically optimize key encapsulation mechanisms (KEMs) based on module learning-with-errors (MLWE), covering algorithmic design, the fundamental number-theoretic transform (NTT) operation, approaches to expanding the encapsulated key size, and optimized implementation coding. We focus on Kyber (now a Round-3 finalist in the NIST PQC standardization) and Aigis (a variant of Kyber proposed at PKC 2020). By careful analysis, we first observe that the algorithmic design of Kyber and Aigis can be optimized with the mechanism of asymmetric key consensus with noise (AKCN) proposed in [12,13]. Specifically, AKCN simplifies the decryption process, making it both faster and less error-prone. Moreover, the AKCN-based optimized version is fully compatible with practical deployments of Kyber/Aigis, as the two can run with the same parameters, the same public key, and the same encryption process. We systematically study the NTT variants proposed in recent years to extend its scope of applicability, give a concrete analysis of their exact computational complexity, and in particular show their equivalence. We then present a new variant named hybrid-NTT (H-NTT), which combines the advantages of existing NTT methods, and derive its optimality in computational complexity. The H-NTT technique not only has a larger scope of applicability but also allows modular and unified implementation code for NTT operations, even with varying module dimensions. We analyze and compare the different approaches to expanding the size of the key to be encapsulated (specifically, a 512-bit key for dimension 1024), and identify the most economical approach. To mitigate the compatibility issue in implementations, we adopt the proposed H-NTT method. Each of the above optimization techniques is of independent value, and we apply all of them to Kyber and Aigis, resulting in new protocol variants named OSKR and OKAI, respectively.
For all the new protocol variants proposed in this work, we provide both AVX2 and ARM Cortex-M4 implementations and present performance benchmarks. Through thorough implementation optimizations, our AVX2 implementation improves efficiency by 17.39% over Kyber-512, by 11.31% over Kyber-768, and by 34.26% over Kyber-1024. Meanwhile, our work shows 53.96%, 25.00%, and 49.08% improvements in speed and an 82.57% reduction in pre-computed root storage compared to Aigis. Also, to the best of our knowledge, our work is the first to present ARM Cortex-M4 implementations of the Aigis variants.
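As background for the NTT discussion, the sketch below multiplies two polynomials in Z_q[x]/(x^n + 1) by evaluating them at the roots of x^n + 1, multiplying pointwise, and interpolating back. It is a toy illustration under stated assumptions: the parameters q = 17, n = 8, and psi = 3 are chosen only so that a primitive 2n-th root of unity exists, the transform is computed by naive O(n^2) evaluation rather than fast butterflies, and it is not the H-NTT proposed in this work. For Kyber's q = 3329, q - 1 = 2^8 · 13 is divisible by 256 but not by 512, so a complete transform of this kind is unavailable for n = 256; this is the kind of restriction that NTT variants are designed to work around.

```python
Q = 17    # toy prime modulus, chosen so that 2*N divides Q - 1 (assumption)
N = 8     # polynomial degree bound; arithmetic is modulo x^N + 1
PSI = 3   # primitive 2N-th root of unity mod Q (3^8 = -1 mod 17)

def ntt(a):
    """Evaluate a(x) at the odd powers of PSI, which are the roots of x^N + 1."""
    return [sum(a[j] * pow(PSI, (2 * i + 1) * j, Q) for j in range(N)) % Q
            for i in range(N)]

def intt(a_hat):
    """Invert ntt(): interpolate the coefficients from the N evaluations."""
    n_inv = pow(N, Q - 2, Q)        # N^-1 mod Q via Fermat's little theorem
    psi_inv = pow(PSI, Q - 2, Q)    # PSI^-1 mod Q
    return [n_inv * sum(a_hat[i] * pow(psi_inv, (2 * i + 1) * j, Q)
                        for i in range(N)) % Q
            for j in range(N)]

def ntt_mul(a, b):
    """Negacyclic product via transform, pointwise multiply, inverse transform."""
    return intt([x * y % Q for x, y in zip(ntt(a), ntt(b))])

def schoolbook_mul(a, b):
    """Reference O(N^2) product modulo x^N + 1, used to check ntt_mul."""
    c = [0] * N
    for i in range(N):
        for j in range(N):
            k = i + j
            if k < N:
                c[k] = (c[k] + a[i] * b[j]) % Q
            else:                   # x^N = -1, so wrapped terms change sign
                c[k - N] = (c[k - N] - a[i] * b[j]) % Q
    return c
```

Both functions should agree on any pair of length-8 coefficient lists with entries in [0, 17), which gives a quick sanity check when experimenting with other toy parameters.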