Proceedings of the ACM International Conference on Computing Frontiers 2013
DOI: 10.1145/2482767.2482770

System integration of tightly-coupled processor arrays using reconfigurable buffer structures

Abstract: As data locality is a key factor for the acceleration of loop programs on processor arrays, we propose a buffer architecture that can be configured at run-time to select between different schemes for memory access. In addition to traditional address-based memory banks, the buffer architecture can deliver data in a streaming manner to the processing elements of the array, which supports dense and sparse stencil operations. Moreover, to minimize data transfers to the buffers, the design contains an interlinked m…
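
The two access schemes named in the abstract (address-based memory banks and streaming delivery) can be illustrated with a small software model. This is only a sketch under stated assumptions: the class `ReconfigurableBuffer`, the `Mode` enum, and the `configure`, `write`, and `read` methods are hypothetical names chosen for illustration, not the interface of the hardware described in the paper, and clearing the contents on reconfiguration is an assumption as well.

```python
from collections import deque
from enum import Enum


class Mode(Enum):
    RAM = "ram"    # address-based access, like a memory bank
    FIFO = "fifo"  # streaming access, first in, first out


class ReconfigurableBuffer:
    """Hypothetical software model of a buffer whose access scheme
    can be switched at run-time (not the paper's actual design)."""

    def __init__(self, depth, mode=Mode.RAM):
        self.depth = depth
        self.configure(mode)

    def configure(self, mode):
        # Assumption: reconfiguring clears the stored contents.
        self.mode = mode
        self._ram = [0] * self.depth
        self._fifo = deque()

    def write(self, value, addr=None):
        if self.mode is Mode.RAM:
            if addr is None:
                raise ValueError("RAM mode needs an address")
            self._ram[addr] = value
        else:
            if len(self._fifo) >= self.depth:
                raise OverflowError("buffer full")
            self._fifo.append(value)

    def read(self, addr=None):
        if self.mode is Mode.RAM:
            if addr is None:
                raise ValueError("RAM mode needs an address")
            return self._ram[addr]
        if not self._fifo:
            raise IndexError("buffer empty")
        return self._fifo.popleft()
```

In this model, a consumer in RAM mode chooses which element to fetch by address, while in FIFO mode it simply takes the next element in arrival order, which is the property that makes streaming delivery attractive for regular loop and stencil access patterns.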

Cited by 11 publications (4 citation statements)
References 10 publications
“…Data transfer between the global memory and the PE array is only performed through the PEs at the border of the array, which is connected to a set of surrounding RBs. These buffers can be configured, to either work as simple FIFOs or as RAM‐based addressable memory banks 34 . The AGs generate the correct sequence of read/write accesses according to a given loop schedule, and they work in parallel with the main computational units or processors to ensure efficient storage/feed of data to/from the main memory.…”
Section: Invasive Computing (mentioning)
confidence: 99%
“…Data transfer between the global memory and the PE array is only performed through the PEs at the border of the array, which is connected to a set of surrounding RBs. These buffers can be configured, to either work as simple FIFOs or as RAM‐based addressable memory banks 34 . The AGs generate the correct sequence of read/write accesses according to a given loop schedule, and they work in parallel with the main computational units or processors to ensure efficient storage/feed of data to/from the main memory.…”
Section: Invasive Computing (mentioning)
confidence: 99%
“…Although this solution scales very well, the overhead for accessing shared resources may compromise the performance of such accelerators. As a solution for this challenge, we propose to use a very flexible buffer structure that can be configured at runtime either as addressable RAM or pixel buffer as first presented in [3].…”
Section: Introduction (mentioning)
confidence: 99%
“…In this paper, we propose to use such buffers for coupling a RISC processor directly to the border processing elements of a class of CGRAs called TCPAs (tightly coupled processor arrays [3]). We are using an edge detection algorithm as a case study to demonstrate how the image processing throughput can thereby be increased up to the memory bandwidth available in the system.…”
Section: Introduction (mentioning)
confidence: 99%