Scalable Montgomery Modular Multiplication Architecture with Low-Latency and Low-Memory Bandwidth Requirement

Lin, Wen-Ching; Ye, Jheng-Hao; Shieh, Ming-Der

doi:10.1109/tc.2012.218

Cited by 17 publications

(4 citation statements)

References 16 publications


“…Ref. [25] reorganizes the operands on the basis of [19] to achieve low memory bandwidth and high frequency while keeping the number of iterations unchanged, but its delay chain under high input bit width contains two-stage multiplication and addition modules, so the balance between frequency and total number of cycles cannot be achieved. Although [11] also uses the full-carry-save method, it uses CPA to complete the data conversion in the iterative process, thus reducing the overall running frequency.…”

Section: Results and Comparisons (mentioning)

confidence: 99%

A Scalable Montgomery Modular Multiplication Architecture with Low Area-Time Product Based on Redundant Binary Representation

Zhaoji, Zhang, 2022, Electronics

Montgomery modular multiplication is an integral operation in public-key cryptographic systems. Previous work achieved good performance at low input widths by combining Redundant Binary Representation (RBR) with Montgomery modular multiplication, but it is difficult to strike a good balance between area and time as input bit widths increase. To solve this problem, we propose a flexible, pipelined hardware implementation of Montgomery modular multiplication based on the redundant Montgomery algorithm. The proposed structure guarantees a single-cycle delay between two-stage pipeline units and shortens the critical path by redistributing the data paths between the pipelines and preprocessing the inputs in the loop. Structural analysis and comparison with related work show that our structure achieves a lower area-time product while keeping area consumption small and controllable. Synthesis results under different Taiwan Semiconductor Manufacturing Company (TSMC) processes demonstrate the advantages of our structure in terms of flexibility and area-time product.
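For readers unfamiliar with the operation this abstract builds on, the following is a minimal software sketch of bit-serial (radix-2) Montgomery multiplication. It is a reference model only, not the RBR pipelined architecture the paper proposes; the function name `montgomery_multiply` and its parameters are illustrative assumptions.

```python
def montgomery_multiply(a, b, n, k):
    """Return a * b * 2^(-k) mod n using radix-2 Montgomery reduction.

    Assumes n is odd, n < 2^k, and 0 <= a, b < n.
    Illustrative reference model; names are hypothetical.
    """
    if n % 2 == 0:
        raise ValueError("modulus must be odd")
    s = 0
    for i in range(k):
        a_i = (a >> i) & 1      # scan the multiplier one bit at a time
        s += a_i * b            # conditionally add the multiplicand
        if s & 1:               # make the partial sum even by adding n
            s += n
        s >>= 1                 # exact division by 2
    if s >= n:                  # the loop keeps s < 2n, so one subtraction suffices
        s -= n
    return s


def to_montgomery(x, n, k):
    """Map x into the Montgomery domain: x * 2^k mod n."""
    return (x << k) % n
```

Multiplying two values that are already in the Montgomery domain yields a result that is still in that domain; a final call with `1` as the second operand converts back to an ordinary residue.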



“…In the recent study, the dependency graph and multiple process elements (PEs) are the research hotspots in the Montgomery modular multiplication algorithm. According to the algorithm, Lin et al [5,13,14] proposed a hardware architecture consisting of multiple PEs to work in parallel for reducing the delay and the memory bandwidth requirement, and achieving higher throughput. Renardy et al [15] designed an iterative modular architecture on FPGA and achieved better AT² (area delay).…”

Section: Related Work (mentioning)

confidence: 99%

The Novel Efficient Dual-field FIPS Modular Multiplication

2020, KSII TIIS

Modular multiplication is the key module of public-key cryptosystems such as RSA (Rivest-Shamir-Adleman) and ECC (Elliptic Curve Cryptography). However, the efficiency of modular multiplication, especially modular squaring, is very low. To reduce operation cycles and power consumption and to improve the efficiency of public-key cryptosystems, a dual-field efficient FIPS (Finely Integrated Product Scanning) modular multiplication algorithm is proposed. The algorithm makes full use of the correlation of the data when the operands are equal so as to avoid redundant operations. The experimental results show that the speed of modular squaring is increased by 23.8% compared to the traditional algorithm, after the multiplication and addition operations are reduced by about $(s^2 - s)/2$ and the read operations are reduced by about $s^2 - s$, where $s = n/32$ for n-bit operands. In addition, since the algorithm supports length-scalable, dual-field modular multiplication, distinct applications focused on performance or cost can be satisfied by adjusting the relevant parameters.
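The saving from equal operands can be illustrated with plain multi-precision product scanning. The sketch below is not the paper's dual-field FIPS modular algorithm: it omits the Montgomery reduction entirely and only counts word multiplications. It assumes 32-bit words, and all function names are hypothetical. In a squaring, each cross product of two different words is computed once and doubled, so an s-word square needs s(s+1)/2 word multiplications instead of s².

```python
W = 32                      # assumed word size in bits
MASK = (1 << W) - 1


def to_words(x, s):
    """Split integer x into s little-endian W-bit words."""
    return [(x >> (W * i)) & MASK for i in range(s)]


def from_words(words):
    """Recombine little-endian W-bit words into an integer."""
    return sum(w << (W * i) for i, w in enumerate(words))


def product_scan_multiply(a, b):
    """Column-wise (product-scanning) multiply; returns (words, mult_count)."""
    s = len(a)
    out, acc, mults = [], 0, 0
    for col in range(2 * s - 1):
        for i in range(max(0, col - s + 1), min(col, s - 1) + 1):
            acc += a[i] * b[col - i]
            mults += 1
        out.append(acc & MASK)   # emit one result word per column
        acc >>= W                # carry the rest into the next column
    out.append(acc)
    return out, mults


def product_scan_square(a):
    """Column-wise square that computes each cross product only once."""
    s = len(a)
    out, acc, mults = [], 0, 0
    for col in range(2 * s - 1):
        i = max(0, col - s + 1)
        j = col - i
        while i < j:             # symmetric pair a[i]*a[j]: compute once, double
            acc += 2 * a[i] * a[j]
            mults += 1
            i += 1
            j -= 1
        if i == j:               # diagonal term a[i]^2 occurs once
            acc += a[i] * a[i]
            mults += 1
        out.append(acc & MASK)
        acc >>= W
    out.append(acc)
    return out, mults
```

For s = 4 words the general routine performs 16 word multiplications and the squaring routine 10, a difference of (s² − s)/2 = 6.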


“…The multiplier component implemented by them have increased the throughput by using single cell that comprised of digital multiplier and adder circuits. In (Lin, Ye, & Shieh, 2014) a new technique that relaxed data dependency in load-based algorithms and tried to reuse the referred word of a variable to realize the implementation of numerous Montgomery modular multiplication algorithms namely high-radix algorithm. It claims 54% cut down of power utility without any degradation in speed.…”

Section: Montgomery Multiplication (mentioning)

confidence: 99%

Bit Forwarding 3-Bits Technique for Efficient Modular Exponentiation

Vollala, Begum, Joshi, et al., 2017, International Journal of Information Security and Privacy

It is widely recognized that public-key cryptosystems play a tremendously important role in providing security services. In the majority of cryptosystems the crucial arithmetic operation is modular exponentiation, which is composed of a series of modular multiplications. Hence, the performance of any cryptosystem strongly depends on the efficient implementation of these operations. This paper presents the Bit Forwarding 3-bits (BFW3) technique for efficient implementation of modular exponentiation. The modular multiplication involved in BFW3 is evaluated with the help of the Montgomery method. These techniques improve performance by reducing the number of modular multiplications. Results show that the BFW3 technique is able to reduce the number of multiplications by 18.20% for a 1024-bit exponent. This reduction resulted in a throughput increase of 18.11% in comparison with MME42_C2 at the cost of 1.09% extra area. The power consumption is reduced by 8.53%, thereby saving up to 10.10% of the energy.
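To make concrete why the number of modular multiplications dominates exponentiation cost, here is a minimal sketch of ordinary left-to-right binary (square-and-multiply) exponentiation with a multiplication counter. This is the textbook baseline, not the BFW3 technique from the abstract, and the name `modexp_count` is an assumption for illustration.

```python
def modexp_count(base, exponent, modulus):
    """Left-to-right square-and-multiply.

    Returns (base**exponent mod modulus, number of modular multiplications).
    Squarings are counted as multiplications. Illustrative baseline only.
    """
    result = 1 % modulus
    mults = 0
    for bit in bin(exponent)[2:]:        # scan exponent bits, MSB first
        result = result * result % modulus
        mults += 1                       # one squaring per exponent bit
        if bit == "1":
            result = result * base % modulus
            mults += 1                   # one extra multiply per set bit
    return result, mults
```

In this baseline the count is the exponent's bit length plus its number of set bits, so techniques that handle several exponent bits per multiplication reduce the second term.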
