2008
DOI: 10.1145/1379023.1375678

Placement-and-routing-based register allocation for coarse-grained reconfigurable arrays

Abstract: DSP architectures often feature multiple register files with sparse connections to a large set of ALUs. For such DSPs, traditional register allocation algorithms suffer from a lot of problems, including a lack of retargetability and phase-ordering problems. This paper studies alternative register allocation techniques based on placement and routing. Different register file models are studied and evaluated on a state-of-the-art coarse-grained reconfigurable array DSP, together with a new post-pass register allocat…

Citations: cited by 14 publications (25 citation statements)
References: 24 publications
“…Using (3-4), gr1+0 will be self-updated as gr1+=4. Applying to (3)(4)(5), an execution unit for the add is configured with a self-forwarding. According to (4-1), since results of add and sub can be supplied to execution units in stage 1 by using a forwarding path from EXEC output registers, prop skp[gr1], prop skp[gr3] and prop skp[z] in stage 1 are set to one at step 1.…”
Section: Implementation and Example (mentioning)
confidence: 99%
“…A research by Hrishikesh, et al [11] indicated that an optimal delay for one stage is around 6 to 8 FO4. Viji Srinivasan, et al [12] showed that an optimal design point based on a power-performance metric ((Billions of Instructions Per Second)^3/Watt) is 18 FO4 per pipeline stage. It is our future work to find optimal depth of pipeline stages for LAPP.…”
Section: Circuit Area and Delay Time (mentioning)
confidence: 99%
“…Modulo scheduling techniques for CGRAs [15,17,20,39,45,46] only schedule loops that are free of control flow transfers. Hence any loop body that contains conditional statements first needs to be if-converted into hyperblocks by means of predication [36].…”
Section: Predication (mentioning)
confidence: 99%
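
Note on the statement above: if-conversion replaces a branch inside a loop body with predicated straight-line code, so the body has no control flow transfers left. The following is a minimal Python sketch of that idea only. It is not taken from the cited papers; the function names, the threshold test, and the arithmetic select are illustrative assumptions.

```python
def loop_with_branch(xs, threshold):
    """Original loop: the body contains a conditional statement."""
    out = []
    for x in xs:
        if x > threshold:
            y = x - threshold
        else:
            y = x + 1
        out.append(y)
    return out


def loop_if_converted(xs, threshold):
    """If-converted loop: both sides are computed, a predicate selects one.

    The body is a single straight-line block, which is the shape a
    modulo scheduler for loops without control flow expects.
    """
    out = []
    for x in xs:
        p = int(x > threshold)       # predicate as 0 or 1
        y_then = x - threshold       # computed unconditionally
        y_else = x + 1               # computed unconditionally
        y = p * y_then + (1 - p) * y_else  # arithmetic select, no branch
        out.append(y)
    return out
```

Both functions return the same list for any integer input. The second one trades extra computation (both sides always run) for a branch-free body.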