A pipelined architecture is proposed in this work to speed up the point multiplication in elliptic curve cryptography (ECC). This is achieved, at first; by pipelining the arithmetic unit to reduce the critical path delay. Second, by reducing the number of clock cycles (latency), which is achieved through careful scheduling of computations involved in point addition and point doubling. These two factors thus, help in reducing the time for one point multiplication computation. On the other hand, the small area overhead for this design gives a higher throughput/area ratio. Consequently, the proposed architecture is synthesised on different FPGAs to compare with the state-of-the-art. The synthesis results over GF(2 m) show that the proposed design can work up to a frequency of 369, 357 and 337 MHz when implemented for m = 163, 233 and 283 bit key lengths, respectively, on Virtex-7 FPGA. The corresponding throughput/slice figures are 42.22, 12.37 and 9.45, which outperform existing implementations.
Applying uni¯ed formula while computing point addition and doubling provides immunity to Elliptic Curve Cryptography (ECC) against power analysis attacks (a type of side channel attack). One of the popular techniques providing this uni¯edness is the Binary Hu® Curves (BHC) which got attention in 2011. In this paper we are presenting highly optimized architectures to implement point multiplication (PM) on the standard NIST curves over GF ð2 163 Þ and GF ð2 233 Þ using BHC. To achieve a high throughput over area ratio,¯rst of all, we have used a simpli¯ed arithmetic and logic unit. Secondly, we have reduced the time to compute PM through Double and Add algorithm. This is achieved by increasing the frequency of operation through a 2-stage pipelined architecture. The increase in clock cycles caused by consequent pipeline hazards is controlled through optimal scheduling of computations involved in PM. The synthesis results show that our designs can work up to a frequency of 377 MHz on Xilinx Virtex 7 FPGA. Moreover, the overall throughput/area ratio achieved through the adopted approach is up to 20% higher while comparing with available state-of-the-art solutions. J CIRCUIT SYST COMP Downloaded from www.worldscientific.com by MONASH UNIVERSITY on 04/22/17. For personal use only. A. R. Jafri et al. 1750178-2 J CIRCUIT SYST COMP Downloaded from www.worldscientific.com by MONASH UNIVERSITY on 04/22/17. For personal use only. Towards an Optimized Architecture for Uni¯ed Binary Hu® Curves 1750178-3 J CIRCUIT SYST COMP Downloaded from www.worldscientific.com by MONASH UNIVERSITY on 04/22/17. For personal use only. A. R. Jafri et al. 1750178-4 J CIRCUIT SYST COMP Downloaded from www.worldscientific.com by MONASH UNIVERSITY on 04/22/17. For personal use only.
Symmetric and asymmetric cryptographic algorithms are used for a secure transmission of data over an unsecured public channel. In order to use these algorithms in real-time applications, many flexible hardware architectures have been proposed and implemented with multiple design constraints. Therefore, a systematic study is required to analyze various implementation approaches. This paper has focused on the identification and classification of recent research practices pertaining to the flexible hardware implementation of cryptographic algorithms. We have used Systematic Literature Review (SLR) process to identify 51 research articles, published during 2008–2017. The identified researches have been classified according to three design approaches: (1) crypto processor, (2) crypto coprocessor and (3) multicore crypto processor. Consequently, a comparative analysis of various cryptographic algorithms in terms of flexibility, throughput, area, power and implementation technology has been presented. A comprehensive investigation of flexible architectures for implementing cryptographic algorithms facilitates researchers and designers of the domain to select an appropriate design approach for a particular algorithm and/or application according to their needs.
This work has proposed a 4-stage pipelined architecture to achieve an optimized throughput over area ratio for point multiplication (PM) computation in binary huff curves (BHC) cryptography. The original mathematical formulation of BHC is revisited with an objective to reduce the required area. Consequently, a simplified formulation of BHC is obtained with 43% reduction in the hardware resources. As far as the throughput is concerned, it is improved first by reducing the critical path and second by minimizing the number of clock cycles (CCs) required to compute one PM. The critical path is reduced through the placement of pipeline registers, whereas the number of required CCs are minimized through an efficient scheduling of computations. These two factors i.e., the area reduction and throughput optimizations, have resulted in maximizing the throughput over area ratio. The proposed pipelined architecture is implemented over [Formula: see text] field, using standard NIST curve parameters. The architecture is modeled in Verilog and synthesized using Xilinx (ISE 14.7) design tool on Virtex 7 FPGA. The implementation results show that 17% improvement in clock frequency, 13% reduction in the time required to compute one PM and 2.6% improvement in throughput/area are achieved when compared with the most recent state of the art solutions.
Universal filtered multicarrier (UFMC) is a low complexity promising waveform that provides quasi-orthogonal property among subcarriers. In addition, it can achieve much better out-of-band emission performance than orthogonal frequency division multiplexing (OFDM) system. Authors have proposed a hardware platform to implement a UFMC transmitter in this paper. Highly reduced complexity schemes for IFFT, filtering, and spectrum shifting are realized on actual hardware. This helps to achieve overall architecture of the transmitter at the cost of minimal FPGA resource usage. Hence, the overall design uses only 1038 slice registers, 1154 slice LUTs, and 64 multipliers of Xilinx Virtex-7 XC7VX330t device. A throughput of 773.5 Msamples/sec at an operational frequency of 364 MHz is achieved. This throughput is adequate for processing 50 Physical Resource Blocks (PRB) of LTE 10 MHz channelization in required time. The presented architecture provides a latency of only 2% of one LTE 10MHz channelization symbol due to the implementation of pipelining at different levels. Although the presented hardware design in its current form meets LTE 10MHz channelization throughput requirements, further increase in throughput is possible due to the scalable nature of the architecture. To the best of our knowledge, this work is first ever FPGA solution for UFMC transmitter presented in the literature.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.