Code Compression and Decompression for Instruction Cell Based Reconfigurable Systems

Aslam, N.; Milward, M.; Nousias, I.; Arslan, Tughrul; Erdogan, A.T.

doi:10.1109/ipdps.2007.370392

Cited by 7 publications

(2 citation statements)

References 9 publications

(9 reference statements)

“…They achieved up to 52% of memory reduction. Aslam et al [1][2][3] applied state-of-the-art dictionary methods to their large 8x8 CGRA. Similar to the approach presented here, PEs are reorganized to improve compression in the dictionary.…”

Section: Dictionary-based Compression Techniques (mentioning)

confidence: 99%

Improving Energy Efficiency of Coarse-Grain Reconfigurable Arrays Through Modulo Schedule Compression/Decompression

Lee, Moghaddam, Suh, et al. 2018

ACM Trans. Archit. Code Optim.

Modulo-scheduled coarse-grain reconfigurable array (CGRA) processors excel at exploiting loop-level parallelism at a high performance-per-watt ratio. The frequent reconfiguration of the array, however, causes between 25% and 45% of the consumed chip energy to be spent on the instruction memory and fetches therefrom. This article presents a hardware/software codesign methodology for such architectures that is able to reduce both the size required to store the modulo-scheduled loops and the energy consumed by the instruction decode logic. The hardware modifications improve the spatial organization of a CGRA's execution plan by reorganizing the configuration memory into separate partitions based on a statistical analysis of code. A compiler technique optimizes the generated code in the temporal dimension by minimizing the number of signal changes. The optimizations achieve, on average, a reduction in code size of more than 63% and in energy consumed by the instruction decode logic by 70% for a wide variety of application domains. Decompression of the compressed loops can be performed in hardware with no additional latency, rendering the presented method ideal for low-power CGRAs running at high frequencies. The presented technique is orthogonal to dictionary-based compression schemes and can be combined with them to achieve a further reduction in code size.

CCS Concepts: • Computer systems organization → Reconfigurable computing; • Hardware → Power estimation and optimization; • Software and its engineering → Compilers

“…However, with the addition of suitable code compression techniques, this overhead can be reduced. The reduction in code size is larger with this reconfigurable architecture than using the same technique with a conventional processor, due to a higher degree of regularity (repetitions) in the configuration data [17].…”

Section: The Multitasking Implementation of the WiMAX on Micro C/OS-II (mentioning)

confidence: 99%

The Design of Multitasking Based Applications on Reconfigurable Instruction Cell Based Architectures

Han, Nousias, Mair, et al. 2007

2007 International Conference on Field Programmable Logic and Applications

Self Cite

This paper presents a new direct implementation of a popular RTOS with an associated application (the WiMAX physical layer) on reconfigurable computing architectures. A novel coarse-grained reconfigurable instruction cell based architecture is chosen as the target architecture. First, an RTOS (Micro C/OS-II) was ported to the target architecture, and then the WiMAX physical layer program was partitioned into multiple OS tasks that communicate with each other through the synchronization mechanisms provided by this RTOS. The WiMAX physical layer program has also been implemented on the ARM7TDMI processor. The results show that the performance of the target architecture is much better than that of the ARM7TDMI and is not limited by the memory latency bottleneck.

Design and evaluation of compact ISA extensions

Lopes, Ecco, Xavier, et al. 2016

Microprocessors and Microsystems
