Proceedings of the 7th ACM International Conference on Computing Frontiers 2010
DOI: 10.1145/1787275.1787328

On-chip communication and synchronization mechanisms with cache-integrated network interfaces

Abstract: Per-core local (scratchpad) memories allow direct inter-core communication, with latency and energy advantages over coherent cache-based communication, especially as CMP architectures become more distributed. We have designed cache-integrated network interfaces (NIs), appropriate for scalable multicores, that combine the best of two worlds, the flexibility of caches and the efficiency of scratchpad memories: on-chip SRAM is configurably shared among caching, scratchpad, and virtualized NI functions. This paper…

citations
Cited by 30 publications
(30 citation statements)
references
References 30 publications
“…This paper extends on our previous work in [17]. Here, we elaborate on the architecture of cache-integrated network interfaces and the technique of event responses that enables their efficient implementation, and also measure the logic overhead of NI integration inside a cache.…”
Section: Introduction
confidence: 86%
“…Interconnect design is one of the two open issues along with programming model in multicore system design [Rutzig, 2013]. Although data communication is a primary anticipated bottleneck for system performance [Dally and Towles, 2007; Kavadias et al, 2010; Orduña et al, 2004], the interconnect design for data communication among the accelerator kernels has not been well addressed in hardware accelerator systems. A simple bus or shared memory is usually used for data communication between the host and the kernels as well as among the kernels.…”
Section: Problem Overview
confidence: 99%
“…In data intensive applications, a large amount of data needs to be transferred from core to core. Therefore, data communication is usually a primary anticipated bottleneck for system performance [Altera, 2008; Becker et al, 2007; Donchev et al, 2006; Kavadias et al, 2010]. One important method to improve the performance of such systems is reducing data communication overhead.…”
Section: Introduction
confidence: 99%
“…Our work on explicit communication and synchronization for the SARC architecture includes an FPGA prototype described in [2] and a longer description of the architecture, with performance measurements collected on the FPGA prototype [13].…”
Section: Related Work
confidence: 99%