35th Annual IEEE/ACM International Symposium on Microarchitecture, 2002. (MICRO-35). Proceedings.
DOI: 10.1109/micro.2002.1176244

Effective instruction scheduling techniques for an interleaved cache clustered VLIW processor

Abstract: Clustering is a common

Citations: cited by 14 publications (19 citation statements)
References: 20 publications
“…One simply provides the microarchitecture and topology in an abstract form; i.e., location, number, and spatial relationship of microarchitectural resources such as PEs, caches, and register files. While we demonstrated SPS's effectiveness for the TRIPS ISA and microarchitecture, we believe it is applicable to schedulers for other partitioned architectures such as WaveScalar [38], and may be useful for clustered VLIWs [20] and RAW [67,34].…”
Section: Spatial Path Scheduling (mentioning)
confidence: 97%
“…Another way to partition the L1 data cache is to distribute a cache line among clusters in a word-interleaved manner [15]. In such a configuration, each cache module will hold some words of each memory block, depending on the data address and the interleaving factor of the architecture.…”
Section: Architecture (mentioning)
confidence: 99%
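
The word-interleaved mapping described in the statement above can be illustrated with a minimal sketch. The quoted text only says that the placement depends on the data address and the interleaving factor; the function names, the 4-byte word size, the 4-cluster count, and the modulo mapping below are assumptions chosen for illustration, not details taken from the cited paper.

```python
def home_cluster(address: int,
                 word_bytes: int = 4,
                 interleave_words: int = 1,
                 num_clusters: int = 4) -> int:
    """Return the cluster whose cache module would hold the word at `address`.

    Assumed mapping: consecutive groups of `interleave_words` words are
    assigned to clusters in round-robin order.
    """
    word_index = address // word_bytes
    return (word_index // interleave_words) % num_clusters


def block_layout(block_base: int,
                 block_bytes: int = 32,
                 word_bytes: int = 4,
                 interleave_words: int = 1,
                 num_clusters: int = 4) -> dict[int, list[int]]:
    """Map each cluster to the byte offsets (within one block) it holds."""
    layout: dict[int, list[int]] = {c: [] for c in range(num_clusters)}
    for offset in range(0, block_bytes, word_bytes):
        cluster = home_cluster(block_base + offset, word_bytes,
                               interleave_words, num_clusters)
        layout[cluster].append(offset)
    return layout


if __name__ == "__main__":
    # Each of the 4 assumed clusters ends up with 2 words of a 32-byte block.
    for cluster, offsets in block_layout(0).items():
        print(cluster, offsets)
```

Under these assumptions every cache module holds a slice of every memory block, so a scheduler that knows the address of a memory access can tell which cluster reaches it locally.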
“…These buses are controlled by the compiler, which is responsible for adding and scheduling an explicit copy operation whenever it assigns two register-flow dependent instructions to different clusters. This paper presents a comparative study of different architecture/compilation techniques that we have recently proposed for fully distributed clustered VLIW processors ([38], [15], [17]). For each proposed architecture, efficient instruction scheduling techniques are developed, which are strongly tied to the architectural configuration in order to exploit its particularities.…”
Section: Introduction (mentioning)
confidence: 99%
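
The copy-insertion rule in the statement above (an explicit copy whenever two register-flow dependent instructions land in different clusters) can be sketched as follows. This is a hypothetical illustration: the data representation, the `required_copies` name, and the choice to emit one copy per value per destination cluster are assumptions, not the algorithm of the cited work.

```python
def required_copies(deps: list[tuple[str, str]],
                    cluster_of: dict[str, int]) -> list[tuple[str, int, int]]:
    """List the inter-cluster copies a cluster assignment would need.

    `deps` holds (producer, consumer) register-flow dependences and
    `cluster_of` maps each instruction to its assigned cluster.
    Returns (producer, source_cluster, destination_cluster) triples.
    Assumption: a value copied once to a cluster can be reused there.
    """
    copies: list[tuple[str, int, int]] = []
    already_sent: set[tuple[str, int]] = set()
    for producer, consumer in deps:
        src = cluster_of[producer]
        dst = cluster_of[consumer]
        if src != dst and (producer, dst) not in already_sent:
            already_sent.add((producer, dst))
            copies.append((producer, src, dst))
    return copies


if __name__ == "__main__":
    deps = [("a", "b"), ("a", "c"), ("b", "c")]
    cluster_of = {"a": 0, "b": 0, "c": 1}
    print(required_copies(deps, cluster_of))
```

In this example `a` and `b` both feed `c` from another cluster, so two copies are needed, while the `a` to `b` dependence stays inside cluster 0 and needs none.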
“…In [23], the authors proposed to distribute the L1 cache among clusters in a cache-coherent manner. In [10], a much simpler design was proposed, in which the L1 data cache is distributed among clusters in a word-interleaved manner. We compare our work to these two distributed cache configurations in Section 5.3.…”
Section: Related Work (mentioning)
confidence: 99%
“…The cache could be close to one or few clusters but not to all of them. Because of that, some recent works advocate for the distribution of the first level data cache among clusters as well [24][23] [10]. Several configurations have been studied and instruction scheduling techniques have been proposed to exploit the underlying cache architecture.…”
Section: Introduction (mentioning)
confidence: 99%