Proceedings of the International Conference on Compilers, Architectures and Synthesis for Embedded Systems 2016
DOI: 10.1145/2968455.2968510

Hybrid network-on-chip architectures for accelerating deep learning kernels on heterogeneous manycore platforms

Abstract: In recent years, designing specialized manycore heterogeneous architectures for deep learning kernels has become an area of great interest. However, the typical on-chip communication infrastructures employed on conventional manycore platforms are unable to handle both CPU and GPU communication requirements efficiently. Hence, in this paper, our aim is to enhance the performance of heterogeneous manycore architectures through the design of a hybrid NoC consisting of both wireline and wireless links. To this end…

Cited by 29 publications (11 citation statements)
References 35 publications (66 reference statements)
“…8, it is evident that even in this optimized NoC, a few links are more heavily utilized when compared to the rest of the links. In a previous work [49], it was demonstrated that links associated with MCs have nearly 500% higher traffic density than the overall average link utilization for the Rodinia backpropagation benchmark [50]. However, this backpropagation benchmark is much simpler than the workloads addressed in this work and only employs a single NN layer.…”
Section: GPUs
confidence: 92%
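The comparison quoted above can be made concrete with a small sketch. The snippet below is purely illustrative and not taken from either cited paper: the link names and flit counts are hypothetical placeholders, and it simply expresses each link's traffic density as a multiple of the network-wide average, the kind of ratio behind the "nearly 500% higher than average" observation for MC-attached links.

```python
# Illustrative only: given per-link flit counts from an NoC simulation trace,
# compute each link's traffic density relative to the network-wide average.
# Link names and counts are hypothetical placeholders, not data from [49]/[50].

def relative_link_utilization(flits_per_link):
    """Return each link's traffic as a multiple of the average link traffic."""
    avg = sum(flits_per_link.values()) / len(flits_per_link)
    return {link: count / avg for link, count in flits_per_link.items()}

if __name__ == "__main__":
    # Hypothetical trace: MC-attached links carry far more traffic than
    # ordinary mesh links under a backpropagation-style workload.
    trace = {"MC0-r5": 50_000, "MC1-r12": 48_000, "r0-r1": 9_000, "r1-r2": 8_500}
    for link, ratio in relative_link_utilization(trace).items():
        print(f"{link}: {ratio:.1f}x average")
```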
“…To handle this challenge, Choi et al. [42] proposed a hybrid (wired + wireless) NoC architecture for heterogeneous CMPs that specifically targets the training phase of DNNs. In the proposed architecture, because CPU-to-memory-controller (MC) communications are latency-sensitive, this type of data exchange is carried out over single-hop wireless interconnects.…”
Section: B. Wireless Interconnects
confidence: 99%
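A minimal sketch of the traffic-class-based medium selection described in this statement, assuming only what the excerpt states: latency-sensitive CPU-to-MC exchanges take a single-hop wireless shortcut, while the remaining traffic stays on the wireline mesh. All type and function names below are illustrative, not the paper's implementation.

```python
# Sketch of medium selection in a hybrid (wired + wireless) NoC, assuming only
# that CPU<->MC traffic is latency-sensitive and that a single-hop wireless
# interface (WI) is reachable from the source router. Names are illustrative.
from dataclasses import dataclass
from enum import Enum, auto

class NodeType(Enum):
    CPU = auto()
    GPU = auto()
    MC = auto()   # memory controller

class Medium(Enum):
    WIRELESS_SINGLE_HOP = auto()
    WIRELINE_MESH = auto()

@dataclass
class Packet:
    src_type: NodeType
    dst_type: NodeType

def select_medium(pkt: Packet, wi_available: bool) -> Medium:
    """Send latency-sensitive CPU<->MC traffic over the wireless shortcut."""
    latency_sensitive = {pkt.src_type, pkt.dst_type} == {NodeType.CPU, NodeType.MC}
    if latency_sensitive and wi_available:
        return Medium.WIRELESS_SINGLE_HOP
    return Medium.WIRELINE_MESH   # bulk GPU/MC traffic stays on wired links

# Example: a CPU request to a memory controller takes the wireless path.
assert select_medium(Packet(NodeType.CPU, NodeType.MC), wi_available=True) \
       == Medium.WIRELESS_SINGLE_HOP
```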
“…In recent years, new interconnection techniques, such as 3D vertical on-chip interconnection [36]-[41], wireless interconnection [42]-[44], and optical interconnection [45], [46], have brought a performance revolution to DNN computing. As mentioned before, memory access latency dominates overall DNN performance, which motivates research on in/near-memory processing techniques.…”
Section: Introduction
confidence: 99%
“…NoC is usually used for specifically designed multi-chip solutions. Choi et al. [31] proposed a hybrid NoC architecture that combines CPUs and GPUs on a single chip for DL. The hybrid network-on-chip architecture also introduces wireless links for CPU-GPU communication.…”
Section: Distributed System
confidence: 99%