Proceedings of the 45th Annual Design Automation Conference 2008
DOI: 10.1145/1391469.1391488

Dynamic register file resizing and frequency scaling to improve embedded processor performance and energy-delay efficiency

Abstract: With CMOS scaling leading to ever-increasing levels of transistor integration on a chip, designers of high-performance embedded processors have ample area available to increase processor resources in order to improve performance. However, increasing resource sizes can increase power dissipation and also increase access time, which can limit maximum achievable operating frequency. In this paper, we explore optimizations for the processor register file (RF), to improve performance and reduce the energy-delay produ…

Cited by 11 publications (9 citation statements)
References 10 publications
“…Adaptive processors (or reconfigurable processors) have already been studied to reduce power consumption [Albonesi et al 2003; Dhodapkar and Smith 2002; Huang et al 2003b]. In each component, such as the branch predictors and the buffers [Huang et al 2003a]; register files [Abella and Gonzalez 2003; Homayoun et al 2008]; issue queues [Cazorla et al 2004; Folegnani and González 2001; Petoumenos et al 2010]; caches [Albonesi et al 2003; Qureshi and Patt 2006; Suh et al 2002]; functional units; and fetch, decode, and issue bandwidth [Albonesi et al 2003; Dhodapkar and Smith 2002; Huang et al 2003b], power gating techniques have also been proposed with minimal area and energy overheads to power down different sections, with negligible impact on the delay.…”
Section: Ideal Sea For An Smt Core (mentioning)
confidence: 99%
“…The functional unit conflicts occur when the processor pipeline has ready instructions, but there are no available functional units to execute them. As studied in several works, increasing the number of functional units in general purpose processors not only increases the power consumption of the processor but will also significantly affect the complexity of several pipeline stages including instruction queue, write-back buffers, bypass stage, register file design and could severely affect the processor performance, as the number of write-back ports increases significantly [16,17].…”
Section: A Performance (mentioning)
confidence: 99%
“…Note that in spite of high functional unit conflicts, it is not design efficient to increase the number of functional units in the processor pipeline, as the complexity of additional functional units will be significant [16,17,19]. Only increasing the total number of functional units (which are equivalent to the maximum issue width) from 4 to 6, increase the critical path delay and the total power of the processor by 21% [16,17].…”
Section: A Performance (mentioning)
confidence: 99%
“…As for timing issue, our approach will increase the critical path delay for at most one XOR gate delay. However, in modern pipeline processors, the performance bottleneck is often found in the stage of register file access due to the increasing number of registers and the requirement of multi-port access [23], [24]. Our proposed expansion hardware is added to L1 data cache in the stage of memory access.…”
Section: • Set_expand_mask M; (mentioning)
confidence: 99%