2012 IEEE/IFIP 20th International Conference on VLSI and System-on-Chip (VLSI-SoC) 2012
DOI: 10.1109/vlsi-soc.2012.7332072

3D-LIN: A configurable low-latency interconnect for multi-core clusters with 3D stacked L1 memory

Abstract: Shared L1 memories are of interest for tightly coupled processor clusters in programmable accelerators as they provide a convenient shared memory abstraction while avoiding cache coherence overheads. The performance of a shared-L1 memory critically depends on the architecture of the low-latency interconnect between processors and memory banks, which needs to provide ultra-fast access to the largest possible L1 working set. The advent of 3D technology provides new opportunities to improve the interconne…

Cited by 1 publication (6 citation statements)
References: 12 publications

“…Performance limitations of the interconnection networks have led to a renewed interest in interconnect research and a transition from traditional bus-based systems to more sophisticated topologies, including mesh NoCs [8], hierarchical bus models [9], flattened butterfly on-chip networks [10] and crossbars [11][12][13]. The ability of crossbars to provide uniform access latency makes them an appealing option in processor-to-L1 memory interface for limited-cardinality clusters (16 PEs, typically), because predictable access latencies allow for quality-of-service guarantees and ease of programming.…”
Section: Related Work (mentioning)
confidence: 99%
“…The key difference between our proposed approach and these works is modularity, which allows stacking of several memory dies on a logic die without the need for new masks for each stacked die. Moreover, our solutions offer better scalability compared with [13] (more in-depth discussion is performed in Sections 4 and 6.2). Additionally, physical synthesis on realistic 3D floorplans make our obtained results more accurate.…”
Section: Related Work (mentioning)
confidence: 99%