Proceedings of the 13th International Conference on Supercomputing 1999
DOI: 10.1145/305138.305148

Adding a vector unit to a superscalar processor

Abstract: The focus of this paper is on adding a vector unit to a superscalar core, as a way to scale current state-of-the-art superscalar processors. The proposed architecture has a vector register file that shares functional units both with the integer datapath and with the floating-point datapath. A key point in our proposal is the design of a high-performance cache interface that delivers high bandwidth to the vector unit at a low cost and low latency. We propose a double-banked cache with alignment circuitry to serve…

Cited by 36 publications (37 citation statements)
References 13 publications
“…Parameters are similar to those found in some recent microprocessors with multimedia extensions like PowerPC970. For VMMX versions a vector cache was used [22]. The vector cache is a two-bank interleaved cache targeted at accessing stride-one vector requests by loading two whole cache lines (one per bank) instead of individually loading the vector elements.…”
Section: Memory Hierarchy Model (mentioning)
confidence: 99%
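
The two-bank behaviour described in this statement can be illustrated with a minimal sketch. It is not the cited paper's design or simulator: the 32-byte line size, the 8-byte element size, and the names `LineAccess`, `lines_for_stride_one`, and `bank_accesses` are all assumptions made for illustration. It only shows why interleaving consecutive cache lines across two banks lets a stride-one request that straddles a line boundary be served by one access per bank rather than one access per element.

```python
from __future__ import annotations

from dataclasses import dataclass

LINE_BYTES = 32  # assumed cache line size (hypothetical, not from the paper)
NUM_BANKS = 2    # two-bank interleaving, as in the quoted description


@dataclass(frozen=True)
class LineAccess:
    line_index: int  # cache line number, i.e. address // LINE_BYTES
    bank: int        # bank that holds this line


def lines_for_stride_one(base_addr: int, num_elems: int,
                         elem_bytes: int = 8) -> list[LineAccess]:
    """Return the cache lines touched by a stride-one vector request."""
    if num_elems <= 0:
        return []
    first = base_addr // LINE_BYTES
    last = (base_addr + num_elems * elem_bytes - 1) // LINE_BYTES
    # Consecutive lines alternate between banks (line index modulo bank count).
    return [LineAccess(i, i % NUM_BANKS) for i in range(first, last + 1)]


def bank_accesses(accesses: list[LineAccess]) -> int:
    """Rounds needed if every bank delivers one whole line per round."""
    per_bank = [0] * NUM_BANKS
    for access in accesses:
        per_bank[access.bank] += 1
    return max(per_bank)


# Four 8-byte elements starting at byte 16 straddle lines 0 and 1.
# The two lines sit in different banks, so one round fetches both.
request = lines_for_stride_one(base_addr=16, num_elems=4)
print(request, bank_accesses(request))
```

Under these assumptions, a 16-element request starting at address 0 touches four lines and needs two rounds, compared with 16 accesses if each element were loaded individually.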
“…The correctness of the output was verified to ensure no visually perceptible losses in accuracy. Finally, we modified our Jinks simulator [10] to be able to filter the input instruction stream provided by ATOM [15] and correctly simulate the emulated instructions.…”
Section: Emulation Libraries and Code Generation (mentioning)
confidence: 99%
“…In [10], we studied the design of cost-effective cache hierarchies to leverage high-bandwidth for out-of-order vector processors. In the same way as conventional vector instructions, MOM memory patterns have the potential to allow a smart exploitation of the spatial locality intrinsic in multimedia codes.…”
Section: Cache Hierarchy (mentioning)
confidence: 99%