Davide Zoni scite author profile

With the recent advances in quantum computing, code-based cryptography is foreseen to be one of the few mathematical solutions to design quantum resistant public-key cryptosystems. The binary polynomial multiplication dominates the computational time of the primitives in such cryptosystems, thus the design of efficient multipliers is crucial to optimize the performance of post-quantum public-key cryptographic solutions. This manuscript presents a flexible template architecture for the hardware implementation of large binary polynomial multipliers. The architecture combines the iterative application of the Karatsuba algorithm, to minimize the number of required partial products, with the Comba algorithm, used to optimize the schedule of their computations. In particular, the proposed multiplier architecture supports operands in the order of dozens of thousands of bits, and it offers a wide range of performance-resources trade-offs that is made independent from the size of the input operands. To demonstrate the effectiveness of our solution, we employed the nine configurations of the LEDAcrypt public-key cryptosystem as representative use cases for large-degree binary polynomial multiplications. For each configuration we showed that our template architecture can deliver a performance-optimized multiplier implementation for each FPGA of the Xilinx Artix-7 mid-range family. The experimental validation performed by implementing our multiplier for all the LEDAcrypt configurations on the Artix-7 12 and 200 FPGAs, i.e., the smallest and the largest devices of the Artix-7 family, demonstrated an average performance gain of 3.6x and 33.3x with respect to an optimized software implementation employing the gf2x C library.

show abstract

Efficient and Scalable FPGA-Oriented Design of QC-LDPC Bit-Flipping Decoders for Post-Quantum Cryptography

Zoni

Galimberti

Fornaciari

2020

IEEE Access

View full text Add to dashboard Cite

Considering code-based cryptography, quasi-cyclic low-density parity-check (QC-LDPC) codes are foreseen as one of the few solutions to design post-quantum cryptosystems. The bit-flipping algorithm is at the core of the decoding procedure of such codes when used to design cryptosystems. An effective design must account for the computational complexity of the decoding and the code size required to ensure the security margin against attacks led by quantum computers. To this end, it is of paramount importance to deliver efficient and flexible hardware implementations to support quantum-resistant publickey cryptosystems, since available software solutions cannot cope with the required performance. This manuscript proposes an efficient and scalable architecture for the implementation of the bit-flipping procedure targeting large QC-LDPC codes for post-quantum cryptography. To demonstrate the effectiveness of our solution, we employed the nine configurations of the LEDAcrypt cryptosystem as representative use cases for QC-LDPC codes suitable for post-quantum cryptography. For each configuration, our template architecture can deliver a performance-optimized decoder implementation for all the FPGAs of the Xilinx Artix-7 mid-range family. The experimental results demonstrate that our optimized architecture allows the implementation of large QC-LDPC codes even on the smallest FPGA of the Xilinx Artix-7 family. Considering the implementation of our decoder on the Xilinx Artix-7 200 FPGA, the experimental results show an average performance speedup of 5 times across all the LEDAcrypt configurations, compared to the official optimized software implementation of the decoder that employs the Intel AVX2 extension. INDEX TERMS QC-LDPC codes, bit-flipping decoding, code-based cryptography, post-quantum cryptography, applied cryptography, FPGA, hardware design

show abstract

Reliable power and time-constraints-aware predictive management of heterogeneous exascale systems

Fornaciari

Agosta

Atienza³

et al. 2018

View full text Add to dashboard Cite

All-Digital Energy-Constrained Controller for General-Purpose Accelerators and CPUs

Zoni

Cremona

Fornaciari

2020

IEEE Embedded Syst. Lett.

View full text Add to dashboard Cite

Considering the energy-cap problem in batterypowered devices, DVFS and power gating represent the defacto state-of-the-art actuators. However, the limited margin available to reduce the operating voltage, the impossibility to massively integrate such actuators on-chip. together with their actuation latency force a revision of such design methodologies. We present an all-digital architecture and a design methodology that can effectively manage the energy-cap problem for CPUs and accelerators. Two quality metrics are put forward to capture the performance loss and the energy budget violations. We employed a vector processor supporting 4 hardware threads as representative usecase. Results show an average performance loss and energy cap violations limited to 2.9% and 3.8%, respectively. Compared to solutions employing the DFS actuator, our alldigital architecture improves the energy-cap violations by 3x while maintaining a similar performance loss.Index Terms-Energy-constrained design, low power, digital design, RTL design, multi-core, power management.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Davide Zoni

PowerTap: All-digital power meter modeling for run-time power monitoring

Flexible and Scalable FPGA-Oriented Design of Multipliers for Large Binary Polynomials

Efficient and Scalable FPGA-Oriented Design of QC-LDPC Bit-Flipping Decoders for Post-Quantum Cryptography

Reliable power and time-constraints-aware predictive management of heterogeneous exascale systems

All-Digital Energy-Constrained Controller for General-Purpose Accelerators and CPUs

Contact Info

Product

Resources

About