2007
DOI: 10.1007/s11265-007-0059-4

Effective Code Generation for Distributed and Ping-Pong Register Files: A Case Study on PAC VLIW DSP Cores

Abstract: The compiler is generally regarded as the most important software component that supports a processor design to achieve success. This paper describes our application of the open research compiler infrastructure to a novel VLIW DSP (known as the PAC DSP core) and the specific design of code generation for its register file architecture. The PAC DSP utilizes port-restricted, distributed, and partitioned register file structures in addition to a heterogeneous clustered data-path architecture to attain l…

Cited by 4 publications (5 citation statements)
References 19 publications

“…Functional units are grouped into two clusters: one cluster for two-issue memory operations and the other for four-issue arithmetic operations. The PAC-DSP core provided by the Industry Technology Research Institute (ITRI) [Lin et al. 2008] is also a clustered architecture using a ping-pong register file design (a register file with reconfigurable connection to functional units) for intercluster data exchange with a limited amount of access ports.…”
Section: Trends in Energy-Efficient VLIW Architectures Design (mentioning)
confidence: 99%
“…A DSP core is mainly used to process streamed data with a kernel loop. Various researchers have proposed software pipelining algorithms to improve the performance of clustered VLIW architectures [Akturan and Jacome 2001; Qian et al. 2002a, 2002b; Zalamea et al. 2001]. This article redefines the instruction scheduling problem from an alternative viewpoint: deadline-constrained energy optimization for energy-proportional computing.…”
Section: Trends in Energy-Efficient VLIW Architectures Design (mentioning)
confidence: 99%
“…The ORC frontend helps to generate the intermediate representation, WHIRL, with five representation levels from "very high" to "very low", where various target-independent optimizations are performed, such as control flow optimization, extended basic block (peephole) optimization, integrated global/local scheduling, and loop transformation at the "very low" level. We have developed specific optimization techniques in the backend for PAC DSP, including copy propagation for irregular register files [17], optimal local register file assignment based on simulated annealing (SA-LRFA) [18], ping-pong aware & local favorable register file assignment (PALF-LRFA) [19], and local-conscious & global register file assignment (LC-GRFA) [20]. LC-GRFA is the most important optimization, which minimizes data communication costs between various registers.…”
Section: Software Development Tools (mentioning)
confidence: 99%
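
The statement above mentions register file assignment based on simulated annealing. As a rough illustration of that general idea only, and not the published SA-LRFA algorithm, the sketch below assigns values to register files so that fewer uses need a cross-file copy. Everything in it is an assumption made for illustration: the function names `comm_cost` and `anneal_assignment`, the file names `rf0` and `rf1`, the per-file `capacity` limit, the cost model (one unit per use whose value lives in a different file than its consumer), and the cooling schedule.

```python
import math
import random


def comm_cost(assign, uses):
    """Count uses whose value lives in a different file than its consumer.

    `uses` is a list of (value, file_of_consuming_unit) pairs. This toy
    cost model is an assumption, not the cost function of the cited work.
    """
    return sum(1 for value, unit_file in uses if assign[value] != unit_file)


def anneal_assignment(values, files, uses, capacity, seed=0,
                      t_start=2.0, t_end=0.01, alpha=0.95, moves_per_temp=50):
    """Simulated annealing over value-to-register-file assignments."""
    rng = random.Random(seed)

    # Start from a round-robin assignment and check it fits the capacity.
    assign = {v: files[i % len(files)] for i, v in enumerate(values)}
    load = {f: 0 for f in files}
    for f in assign.values():
        load[f] += 1
    if any(n > capacity for n in load.values()):
        raise ValueError("capacity too small for a round-robin start")

    cost = comm_cost(assign, uses)
    best, best_cost = dict(assign), cost

    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            # Propose moving one value to another file that has free space.
            v = rng.choice(values)
            new_f = rng.choice(files)
            old_f = assign[v]
            if new_f == old_f or load[new_f] >= capacity:
                continue
            assign[v] = new_f
            delta = comm_cost(assign, uses) - cost
            # Metropolis rule: always accept improvements, and accept a
            # worse move with probability exp(-delta / t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                load[old_f] -= 1
                load[new_f] += 1
                cost += delta
                if cost < best_cost:
                    best, best_cost = dict(assign), cost
            else:
                assign[v] = old_f  # reject: undo the move
        t *= alpha  # geometric cooling
    return best, best_cost


if __name__ == "__main__":
    values = ["a", "b", "c", "d"]
    files = ["rf0", "rf1"]
    uses = [("a", "rf0"), ("b", "rf0"), ("c", "rf1"), ("d", "rf1")]
    print(anneal_assignment(values, files, uses, capacity=3))
```

The sketch only moves one value at a time, so it needs some free space in the target file; a fuller treatment would presumably also consider swaps, port restrictions, and the cost of the copy instructions themselves.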