This paper presents an implementation of Rivest, Shamir and Adleman (RSA) cryptosystem based on hardware/software (HW/SW) co-design. The main operation of RSA is the modular exponentiation (ME) which is performed by repeated modular multiplications (MMs). In this work, the right-to-left (R2L) algorithm is used for the implementation of the ME as a programmable system on chip (PSoC). The processor MicroBlaze of Xilinx is used for flexibility. The R2L method is often suggested to improve the timing performance, since it is based on parallel computations of MMs. However, if the optimization of HW resources is a constraint, this method can be executed sequentially using a single modular multiplier as a custom intellectual property (IP). Consequently, the execution time of the ME becomes dependent of three factors, namely the capability of the custom IP to perform the MMs, the nonzero bit string of the exponent and the communication link between the processor and the custom IP. In order to achieve the best trade-off between area, speed and flexibility, we propose three implementations in this work. The first one is a pure software solution. The second one takes benefit of a HW accelerator dedicated to the MM execution. The last one is based on a dual strategy. Two parallel MMs are implemented within a custom IP and local memories are used close to the arithmetic units to minimize the communication link influence. The results show that in the application to RSA 1024-bits, the ME runs in 22,25 ms, while using only 1,848 slices.
In this paper, we present Microblaze-based parallel architectures of Elliptic Curve Scalar Multiplication (ECSM) computation for embedded Elliptic Curve Cryptosystem (ECC) on Xilinx FPGA. The proposed implementations support arbitrary Elliptic Curve (EC) forms defined over large prime field ([Formula: see text]) with different security-level sizes. ECSM is performed using Montgomery Power Ladder (MPL) algorithm in Chudnovsky projective coordinates system. At the low abstraction level, Montgomery Modular Multiplication (MMM) is considered as the critical operation. It is implemented within a hardware Accelerator MMM (AccMMM) core based on the modified high radix, [Formula: see text] MMM algorithm. The efficiency of our parallel implementations is achieved by the combination of the mixed SW/HW approach with Multi Processor System on Programmable Chip (MPSoPC) design. The integration of multi MicroBlaze processor in single architecture allows not only the flexibility of the overall system but also the exploitation of the parallelism in ECSM computation with several degrees. The Virtex-5 parallel implementations of 256-bit and 521-bis ECSM computations run at 100[Formula: see text]MHZ frequency and consume between 2,739 and 6,533 slices, 22 and 72 RAMs and between 16 and 48 DSP48E cores. For the considered security-level sizes, the delays to perform single ECSM are between 115[Formula: see text]ms and 14.72[Formula: see text]ms.
A reconfigurable architecture for efficient computation of several elementary functions, in double precision floating-point format, is presented in this paper. The main idea is to tailor the computation method towards FPGA resources of Virtex-II circuits to increase the execution performances of these functions. Our method employs a piecewise Minimax approximation and look-up tables. To attain a precision of one ULP (Unit in Last Place) without exceeding the memory available in Virtex-II FPGAs, third degree approximation polynomials were needed which have been evaluated using Horner's scheme to reduce the multiplications number. Some strategies were used in the Fused Multiplier Adder (FMAs) to overcome the carry propagation such as the carry save representation for the intermediate results to minimize the multipliers delay.A pipelined architecture implementing our method is proposed and its execution time and area costs estimations are presented showing that the architecture has attained a cycle time of 17.372 ns and an operating frequency of 57 MHz with a latency of four cycles and a throughput of one result per cycle.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.