2010
DOI: 10.1109/mm.2010.77

Explicit Communication and Synchronization in SARC

Abstract: SARC merges cache controller and network interface functions by relying on a single hardware primitive: each access checks the tag and the state of the addressed line for possible occurrence of events that may trigger responses like coherence actions, RDMA, synchronization, or configurable event notifications. The fully virtualized and protected user-level API is based on specially marked lines in the scratchpad space that respond as command buffers, counters, or queues. The runtime system maps communication a…

citations
Cited by 10 publications
(10 citation statements)
references
References 29 publications
“…Both types of packets also include some protocol/network headers. The r_{c\d} ratio is used to evaluate the control relative to data traffic volume, similar to [11,17]. This work considers two corner cases, more control packets per data packet (r_{c\d} > 1) and more data packets per control packet (r_{c\d} < 1).…”
Section: A NoC Model (mentioning)
confidence: 99%
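
As a rough illustration of the control-to-data ratio in the statement above, the sketch below assumes the ratio is simply the number of control packets divided by the number of data packets. The quoted text does not give the exact definition, so this reading, the function names, and the corner-case labels are all hypothetical.

```python
def control_to_data_ratio(control_packets: int, data_packets: int) -> float:
    """Assumed definition: control packets per data packet."""
    if control_packets < 0 or data_packets <= 0:
        raise ValueError("need control_packets >= 0 and data_packets > 0")
    return control_packets / data_packets


def corner_case(ratio: float) -> str:
    """Name the corner case a ratio falls into (labels are hypothetical)."""
    if ratio > 1:
        return "control-dominated"  # more control packets per data packet
    if ratio < 1:
        return "data-dominated"     # more data packets per control packet
    return "balanced"


# Three control packets per data packet versus one control packet per four data packets.
print(corner_case(control_to_data_ratio(3, 1)))
print(corner_case(control_to_data_ratio(1, 4)))
```

If the cited model weighs packets by size rather than by count, the same structure applies with byte volumes in place of packet counts.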
“…On the other hand, using a shared queue results to higher communication traffic, since the packets have first to travel to the shared queue and then to the second stage (thus the shared queue results in higher power consumption). A detailed performance evaluation against other schemes such as directory-based cache coherence with or without hardware prefetching has been presented in [27].…”
Section: Shared vs. Distributed Queues (mentioning)
confidence: 99%
“…The first hardware design project which uses the Formic board is a prototype of a non cache-coherent manycore architecture, based on ideas of the SARC project [3], which was fully implemented in software simulation and partially implemented on a XUPV5 hardware platform [4]. Each board fits in its FPGA eight CPUs, their private L1 and L2 caches, eight GTP links and a full network-on-chip centered around a 22-port crossbar.…”
Section: The Scalable Hardware Architecture (mentioning)
confidence: 99%
“…The Counter & Mailbox (CMX) block keeps 128 counters that can be used to track the progress of ongoing DMA operations [3]. The counters can be polled by the CPU, send an interrupt and/or send notification packets to other counters when the programmed number of acknowledgment packets has been received.…”
Section: The Scalable Hardware Architecture (mentioning)
confidence: 99%
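
The counter behavior described in the statement above can be sketched in software. This is a minimal model, not the CMX hardware interface: the class name, its fields, and the choice to ignore acknowledgments that arrive after completion are all assumptions made for illustration. A counter is programmed with an expected number of acknowledgments, can be polled, and on completion invokes notifications, which may themselves be acknowledgments to other counters.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class CompletionCounter:
    """Hypothetical software model of one DMA completion counter."""
    expected_acks: int
    on_complete: List[Callable[[], None]] = field(default_factory=list)
    received: int = 0
    fired: bool = False

    def ack(self) -> None:
        """Record one acknowledgment; notify once the programmed count is reached."""
        if self.fired:
            return  # assumption: acknowledgments after completion are ignored
        self.received += 1
        if self.received >= self.expected_acks:
            self.fired = True
            for notify in self.on_complete:
                notify()  # e.g. an interrupt handler or another counter's ack

    def poll(self) -> bool:
        """Return True once the tracked operation has completed."""
        return self.fired


# Two DMA transfers feed one barrier counter, which "interrupts" by logging an event.
events: List[str] = []
barrier = CompletionCounter(2, [lambda: events.append("barrier done")])
dma_a = CompletionCounter(3, [barrier.ack])
dma_b = CompletionCounter(1, [barrier.ack])

for _ in range(3):
    dma_a.ack()
dma_b.ack()
```

Chaining counters this way lets a single poll on `barrier` stand in for the completion of several independent transfers.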