The need to support various digital signal processing (DSP) and classification applications on energy-constrained devices has steadily grown. Such applications often perform matrix multiplications extensively using fixed-point arithmetic while tolerating some computational error. Hence, improving the energy efficiency of multiplications is critical. In this paper, we propose an approximate multiplier that is high speed yet energy efficient. The approach is to round the operands to the nearest power of two. In this way, the computationally intensive part of the multiplication is eliminated, improving speed and power consumption at the cost of a small error. The proposed approach is applicable to both signed and unsigned multiplication. We present three hardware implementations of the approximate multiplier, covering both unsigned and signed operations. The efficiency of the proposed multiplier is evaluated by comparing its performance with that of several approximate and exact multipliers using different design parameters. In addition, the efficacy of the proposed approximate multiplier is examined in two image processing applications, including image sharpening.
Keywords-Approximate multiplier, energy efficiency, power consumption, integrated circuits, DSP
I. BACKGROUND
Energy minimization is one of the main design requirements in almost any electronic system, especially portable ones such as smartphones, tablets, and other gadgets [1]. It is highly desirable to achieve this minimization with a minimal performance (speed) penalty [1]. Because multiplications account for much of the arithmetic in these workloads, improving the speed and power/energy efficiency of multipliers plays a key role in improving the efficiency of processors. Many DSP cores implement image and video processing algorithms whose final outputs are images or videos prepared for human consumption. This fact enables us to use approximations to improve speed and energy efficiency.
This tolerance stems from the limited perceptual ability of humans when viewing an image or a video. In addition to image and video processing, there are other areas where the exactness of the arithmetic operations is not critical to the functionality of the system (see [3], [4]). Approximate computing lets the designer trade off accuracy against speed and power/energy consumption [2], [5]. Approximation can be applied to arithmetic units at different design abstraction levels, including the circuit, logic, and architecture levels, as well as the algorithm and software layers [2]. The approximation may be performed
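The rounding idea described in the abstract can be illustrated with a small behavioral sketch in Python. It assumes unsigned integer operands, rounds ties upward, and uses the identity A×B = Ar×B + Br×A − Ar×Br + (Ar − A)(Br − B), dropping the last term so that only shifts, additions, and one subtraction remain. The function names round_to_pow2 and approx_mul and the tie-breaking rule are assumptions made for illustration; the text above does not spell out the exact formulation or the hardware structure.

```python
def round_to_pow2(x: int) -> int:
    """Round a positive integer to the nearest power of two (ties round up)."""
    if x <= 0:
        raise ValueError("x must be positive")
    lower = 1 << (x.bit_length() - 1)  # largest power of two <= x
    upper = lower << 1                 # next power of two above lower
    return lower if (x - lower) < (upper - x) else upper


def approx_mul(a: int, b: int) -> int:
    """Approximate a * b for unsigned ints using shifts, adds, and a subtract."""
    if a < 0 or b < 0:
        raise ValueError("this sketch handles unsigned operands only")
    if a == 0 or b == 0:
        return 0
    ar, br = round_to_pow2(a), round_to_pow2(b)
    sa = ar.bit_length() - 1  # log2(ar)
    sb = br.bit_length() - 1  # log2(br)
    # Exact: a*b = ar*b + br*a - ar*br + (ar - a)*(br - b).
    # The last term is dropped; the rest needs no true multiplication.
    return (b << sa) + (a << sb) - (1 << (sa + sb))
```

For example, approx_mul(5, 13) rounds the operands to 4 and 16 and returns 68, against an exact product of 65. When either operand is already a power of two, the dropped term is zero and the result is exact.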
In this paper, we present a carry skip adder (CSKA) structure that has a higher speed yet lower energy consumption compared with the conventional one. The speed enhancement is achieved by applying concatenation and incrementation schemes to improve the efficiency of the conventional CSKA (Conv-CSKA) structure. In addition, instead of utilizing multiplexer logic, the proposed structure makes use of AND-OR-Invert (AOI) and OR-AND-Invert (OAI) compound gates for the skip logic. The structure may be realized with both fixed stage size and variable stage size styles, wherein the latter further improves the speed and energy parameters of the adder. Finally, a hybrid variable latency extension of the proposed structure, which lowers the power consumption without considerably impacting the speed, is presented. This extension utilizes a modified parallel structure for increasing the slack time, and hence, enabling further voltage reduction. The proposed structures are assessed by comparing their speed, power, and energy parameters with those of other adders using a 45-nm static CMOS technology for a wide range of supply voltages. The results that are obtained using HSPICE simulations reveal, on average, 44% and 38% improvements in the delay and energy, respectively, compared with those of the Conv-CSKA. In addition, the power-delay product was the lowest among the structures considered in this paper, while its energy-delay product was almost the same as that of the Kogge-Stone parallel prefix adder with considerably smaller area and power consumption. Simulations on the proposed hybrid variable latency CSKA reveal reduction in the power consumption compared with the latest works in this field while having a reasonably high speed.
Index Terms-Carry skip adder (CSKA), energy efficient, high performance, hybrid variable latency adders, voltage scaling.
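As a rough illustration of the carry-skip principle that the abstract above builds on, the following Python sketch models a conventional fixed-stage-size carry skip adder at the bit level. It is not the proposed structure: the concatenation and incrementation schemes and the AOI/OAI skip gates are not modeled. The names ripple_block and carry_skip_add, the 16-bit default width, and the 4-bit stage size are assumptions chosen for the example.

```python
def ripple_block(a_bits, b_bits, cin):
    """Ripple-carry add one stage; bits are LSB-first lists of 0/1."""
    sums, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        sums.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sums, carry


def carry_skip_add(x, y, width=16, block=4):
    """Behavioral model of a conventional fixed-stage-size carry skip adder.

    Returns (sum modulo 2**width, carry-out).
    """
    a = [(x >> i) & 1 for i in range(width)]
    b = [(y >> i) & 1 for i in range(width)]
    result, carry = 0, 0
    for lo in range(0, width, block):
        blk_a, blk_b = a[lo:lo + block], b[lo:lo + block]
        sums, ripple_cout = ripple_block(blk_a, blk_b, carry)
        # Stage propagate signal: every bit position has a XOR b == 1.
        propagate = all(p ^ q for p, q in zip(blk_a, blk_b))
        # Skip logic (a 2:1 multiplexer in the conventional design):
        # forward the incoming carry when the whole stage propagates.
        carry = carry if propagate else ripple_cout
        for i, s in enumerate(sums):
            result |= s << (lo + i)
    return result, carry
```

In this functional model the skip path never changes the result, because a fully propagating stage ripples its carry-in to its carry-out anyway. In hardware, the benefit is a shorter carry path through such stages.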
Cryptographic pairings are important primitives for many advanced cryptosystems. Efficient computation of pairings requires the use of several layers of algorithms as well as optimizations in different algorithm and implementation levels. This makes implementing cryptographic pairings a difficult task particularly in hardware. Many existing hardware implementations fix the parameters of the pairing to improve efficiency but this significantly limits the generality and practicality of the solution. In this paper, we present a compact and programmable yet high-performance architecture for programmable system-on-chip platforms designed for efficient computation of different cryptographic pairings. We demonstrate with real hardware that this architecture can compute optimal ate pairings on a Barreto-Naehrig curve with 126-bit security in 2.18 ms in a Xilinx Zynq-7020 device and occupies only about 3200 slices, 36 DSPs, and 18 BRAMs. We also show that the architecture can support different types of pairings via microcode updates and can be implemented on other reprogrammable devices with very minor modifications.
In this paper, we propose a modified carry select adder (CSLA) structure which is more power/energy and area-efficient compared to the existing CSLAs. The higher efficiency is achieved by modifying the logic formulations of the carry generation and selection (CGS) scheme and merging all of its redundant logic operations in the carry generation (CG) and carry selection (CS) units of the CGS-based CSLA (CGS-CSLA) structure. This leads to a simplified structure. Next, the proposed CSLA structure is employed to design an efficient square-root CSLA (SQRT-CSLA) structure. The efficiency of the proposed SQRT-CSLA is investigated by comparing its speed, power, energy, and area parameters with those of some other SQRT-CSLA structures, including conventional SQRT-CSLA, binary to excess-1 converter, common Boolean logic, and the CGS-based SQRT-CSLA structures. The investigation, which is performed using HSPICE simulations based on a 45-nm bulk CMOS technology, includes 8, 16, 32, and 64-bit adder structures. The impact of voltage scaling on the efficiency of the proposed structure is also studied by changing the supply voltage levels from the near-threshold voltage to the nominal supply voltage. Simulation results reveal that the proposed SQRT-CSLA provides, at least, 14%, 14%, and 15% lower energy, energy-delay product, and area-delay product, respectively, compared to those of the CGS-based SQRT-CSLA structure, averaged over the supply voltage and bit length.
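The basic carry-select idea behind the structures compared in the abstract above can be sketched in Python as follows. This is a behavioral model of a conventional CSLA with equal-sized stages, not the proposed CGS-based design and not a square-root (variable stage size) arrangement. The name carry_select_add and the default 16-bit width and 4-bit stage size are assumptions chosen for the example.

```python
def carry_select_add(x, y, width=16, block=4):
    """Behavioral model of a conventional carry select adder.

    Each stage computes its sum for carry-in 0 and carry-in 1, then the
    real incoming carry selects one of the two. Returns
    (sum modulo 2**width, carry-out).
    """
    if width % block != 0:
        raise ValueError("this sketch assumes width is a multiple of block")
    mask = (1 << block) - 1
    result, carry = 0, 0
    for lo in range(0, width, block):
        a = (x >> lo) & mask
        b = (y >> lo) & mask
        s0 = a + b      # candidate result assuming carry-in = 0
        s1 = a + b + 1  # candidate result assuming carry-in = 1
        chosen = s1 if carry else s0  # selection by the actual carry
        result |= (chosen & mask) << lo
        carry = chosen >> block       # stage carry-out
    return result, carry
```

In hardware the two candidate sums of every stage are computed in parallel, so only the selection step waits on the incoming carry; the sequential loop here reproduces the result, not the timing.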