Abstract-This work presents an ultra-high-speed CMOS 4-2 compressor, an essential part of fast digital arithmetic integrated circuits. Current-mode techniques are used to improve the overall performance of the compressor. The proposed fully differential circuit improves delay to less than 37% and also reduces the occupied area in comparison with other conventional high-speed compressor circuits. To evaluate the performance of the proposed circuit, a conventional gate-level structure has been chosen for comparison, and all of the circuits have been simulated in a 65-nm IBM CMOS process with a 1.2-V power supply voltage.
Index Terms-Digital logic, 4-2 compressor, CMOS, high speed, current-mode.
I. INTRODUCTION
With the ever-increasing possibilities that VLSI systems provide for realizing high-speed digital building blocks, there is a trend toward using digital units to implement processing algorithms, even for tasks that were originally analog, such as front-end communications. Microprocessors and digital signal processors rely on efficient implementations of fast arithmetic logic units to execute dedicated algorithms such as convolution and filtering [1], [2]. Adders and multipliers are the most frequently and widely used arithmetic cells in these processors. In most of these applications, multipliers dictate the overall performance of the system when speed and power consumption are the limiting factors. At the circuit design level, there is great potential for optimizing these building blocks through voltage scaling or the application of new CMOS logic styles to their constituent combinational circuits [3]. A fast array or tree multiplier is typically composed of three subcircuits: 1) a Booth encoder for the generation of a reduced number of partial products; 2) a carry-save structured accumulator for a further reduction of the partial-product matrix to the addition of only two operands; and 3) a fast carry propagation adder (CPA) [4] for computing the final binary result from its carry-save representation. Among these subcircuits, the second stage, partial product accumulation, often referred to as the carry-save adder (CSA) tree [5]-[7], contributes most to the overall delay and occupies a high fraction of the silicon area. Therefore, increasing the speed of the CSA subcircuits is crucial to improving the performance of the multiplier. Early CSA tree designs used Dadda's column compression technique [8] with 3-2 counters, or equivalently full adders, to reduce the partial product matrix. To reduce the delay of the partial product accumulation stage, 4-2 compressors are now widely employed in high-speed multipliers.
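As a purely behavioral illustration of what a 4-2 compressor computes, the following Python sketch models it as two cascaded full adders (3-2 counters). It assumes the standard definition, in which five input bits of equal weight (four operand bits plus a carry-in) are reduced to one sum bit and two carry bits of double weight. The function names `full_adder`, `compressor_4_2`, and `check_all` are hypothetical and chosen only for this example; the model says nothing about delay, area, or the current-mode circuit proposed in this work.

```python
from itertools import product


def full_adder(a, b, c):
    """3-2 counter: reduce three bits of equal weight to (sum, carry)."""
    s = a ^ b ^ c
    carry = (a & b) | (b & c) | (a & c)
    return s, carry


def compressor_4_2(x1, x2, x3, x4, cin):
    """Behavioral 4-2 compressor built from two cascaded full adders.

    Returns (sum, carry, cout). cout depends only on x1, x2, x3, so it
    does not wait for cin from the neighboring column.
    """
    s1, cout = full_adder(x1, x2, x3)
    total, carry = full_adder(s1, x4, cin)
    return total, carry, cout


def check_all():
    """Check the arithmetic identity for all 32 input combinations."""
    for x1, x2, x3, x4, cin in product((0, 1), repeat=5):
        s, carry, cout = compressor_4_2(x1, x2, x3, x4, cin)
        # Inputs have weight 1; carry and cout have weight 2.
        assert x1 + x2 + x3 + x4 + cin == s + 2 * (carry + cout)
    return True


if __name__ == "__main__":
    print(check_all())
```

Because `cout` is produced by the first full adder alone, a row of such compressors has no horizontal carry ripple, which is one reason they suit column compression trees.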
Because of their regular interconnection, these 4-2 compressors are ideal for the construction of a regularly structured Wallace tree with low complexity [7]-[9]. Several 4-2 compressor circuits have been proposed for high-speed applications [3]. In this paper, we begin with a brief introduction to conventional compressors, which are composed of two full adders, with each full adder optimized at the gate level to achieve high spe...