16th IEEE Symposium on Computer Arithmetic, 2003. Proceedings.
DOI: 10.1109/arith.2003.1207655

Multiple-precision fixed-point vector multiply-accumulator using shared segmentation

Abstract: This paper presents a 64-bit fixed-point vector multiply-accumulator (MAC) architecture capable of supporting multiple precisions. The vector MAC can perform one 64x64, two 32x32, four 16x16 or eight 8x8 bit signed/unsigned multiply-accumulates using essentially the same hardware as a scalar 64-bit MAC and with only a small increase in delay. The scalar MAC architecture is "vectorized" by inserting mode-dependent multiplexing into the partial product generation and by inserting mode-dependent kills in the carr…
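The four operating modes listed in the abstract can be illustrated with a small functional model. This is only a behavioral sketch of what each mode computes, not the paper's hardware design: the names `split_lanes` and `vector_mac` are hypothetical, lanes are assumed to be packed least-significant first, and the accumulators are unbounded Python integers because the truncated abstract does not state the accumulator width.

```python
WORD_BITS = 64  # operand width given in the abstract


def split_lanes(word, lane_bits, signed):
    """Split a 64-bit word into lanes, least significant lane first (assumed order)."""
    mask = (1 << lane_bits) - 1
    lanes = []
    for i in range(WORD_BITS // lane_bits):
        value = (word >> (i * lane_bits)) & mask
        # Reinterpret the lane as two's complement when signed mode is selected.
        if signed and value >> (lane_bits - 1):
            value -= 1 << lane_bits
        lanes.append(value)
    return lanes


def vector_mac(a, b, acc, lane_bits, signed=False):
    """Return acc[i] + a_lane[i] * b_lane[i] for every lane.

    lane_bits selects the mode: 64 (one lane), 32 (two), 16 (four) or 8 (eight).
    acc is a list of per-lane accumulators; their width is NOT modeled here.
    """
    if lane_bits not in (8, 16, 32, 64):
        raise ValueError("lane_bits must be 8, 16, 32 or 64")
    lanes_a = split_lanes(a, lane_bits, signed)
    lanes_b = split_lanes(b, lane_bits, signed)
    if len(acc) != len(lanes_a):
        raise ValueError("acc must hold one accumulator per lane")
    return [s + x * y for s, x, y in zip(acc, lanes_a, lanes_b)]
```

For example, `vector_mac(0x0000000200000003, 0x0000000400000005, [0, 0], 32)` treats each operand as two 32-bit lanes and multiplies them pairwise. In hardware the same effect is obtained by sharing one multiplier array across modes rather than by looping over lanes, which is the point of the architecture described above.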

Cited by 24 publications (6 citation statements)
References 15 publications (18 reference statements)
“…Therefore, employing the least sufficient precision that produces the prescribed solution accuracy can result in higher performance without increasing power consumption. Dynamically configured ALUs according to a level of precision arithmetic were proposed in [19,20,21,22,23,24] to exploit such lower precision arithmetic benefits. In [19], a 64-bit multiply accumulator can be configured according to multiple precisions to compute one 64x64, two 32x32, four 16x16, or eight 8x8 unsigned/signed multiply-accumulations using shared segmentation.…”
Section: Exploiting the Increased Parallelism on FPGA and ASIC (mentioning)
confidence: 99%
“…Thus, multi-mode ALUs become more attractive. In [13], Tan proposed a 64-bit multiply accumulator (MAC) that can compute one 64x64, two 32x32, four 16x16, or eight 8x8 unsigned/signed multiply-accumulations using shared segmentation. On the other hand, Akkas presented architectures for dual-mode adders and multipliers in floating-point [14,15], and Isseven presented a dual-mode floating-point divider [16] that supports two parallel double-precision divisions or one quadruple-precision division.…”
Section: Literature Review (mentioning)
confidence: 99%
“…The study of SIMD arithmetic units starts with the fixed-point unit. Many fixed-point optimized subword-parallel hardware structures reducing the area and cycle delay have been developed, such as subword-parallel adders [7], multiple-precision multipliers and multiply-add (MAC) units using Booth encoding [8] as well as not using Booth encoding [9].…”
Section: Related Work (mentioning)
confidence: 99%
“…The multiplier in the proposed MAF unit is able to perform either one 53-bit or two parallel 24-bit multiplications. Two methods can be used to design the multiplier: one is Booth encoding [8], and the other is an array multiplier [9]. Although Booth encoding can reduce the number of partial products by half and make the compression tree smaller, it adds complexity to the control logic when handling two precision multiplications, which increases the latency.…”
Section: Multiplier (mentioning)
confidence: 99%
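
The remark in the last citation statement that Booth encoding halves the number of partial products can be made concrete with a small sketch of radix-4 Booth recoding. This is the textbook technique for signed two's-complement multipliers, written here purely for illustration; it is not taken from any of the cited designs, and the function names `booth_radix4_digits` and `booth_multiply` are hypothetical.

```python
def booth_radix4_digits(y, bits):
    """Recode a signed `bits`-bit multiplier into radix-4 Booth digits.

    Each digit lies in {-2, -1, 0, 1, 2} and stands for one partial product,
    so a `bits`-bit multiplier needs bits // 2 partial products instead of bits.
    """
    if bits % 2:
        raise ValueError("bits must be even for radix-4 recoding")
    y &= (1 << bits) - 1  # view y as a two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x, y, bits):
    """Multiply x by the signed `bits`-bit value y by summing Booth partial products."""
    return sum((d * x) << (2 * k)
               for k, d in enumerate(booth_radix4_digits(y, bits)))
```

An 8-bit multiplier yields four digits, so `booth_multiply(7, -3, 8)` sums four partial products rather than eight. The extra control complexity mentioned in the quote arises because, in a multi-precision unit, the recoding must also restart at every lane boundary, which this single-precision sketch does not model.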