2020 ACM/IEEE 47th Annual International Symposium on Computer Architecture (ISCA) 2020
DOI: 10.1109/isca45697.2020.00013

High-Performance Deep-Learning Coprocessor Integrated into x86 SoC with Server-Class CPUs Industrial Product

citations
Cited by 12 publications
(6 citation statements)
references
References 14 publications
“…2) Target Architecture: CoSA targets spatial architectures with an array of processing elements (PEs) connected via an on-chip network and with multiple levels of memory hierarchy, a commonly adopted architecture template in today's DNN accelerator designs [18], [19], [28], [29], [36], [44], [54], [55], [60], [69], [71].…”
Section: A CoSA Overview (mentioning)
confidence: 99%
“…DNN-based approaches have been applied to computer vision [34], [43], [57], machine translation [64], [68], audio synthesis [66], recommendation models [31], [46], autonomous driving [11] and many other fields. Motivated by the high computational requirements of DNNs, there have been exciting developments in both research and commercial spaces in building specialized DNN accelerators for both edge [1], [16], [17], [26], [50], [61], [63], [72] and cloud applications [5], [19], [27], [36], [39], [69].…”
Section: Introduction (mentioning)
confidence: 99%
“…Some portions of the NS-DR and NSCL are more suited for acceleration than others. The CNN-based video frame parser and the transformer-based question parser model can be accelerated with existing GPU or accelerator architectures [34]. On the other hand, the NS-DR's dynamics predictor is severely data-movement-bound due to its many sparse and small tensor operations, and also suffers from the difficulty of multiplying "tall-and-skinny" matrices.…”
Section: A NSCL and NS-DR (mentioning)
confidence: 99%
“…• SambaNova has released some impressive benchmark results for their reconfigurable AI accelerator technology, but they still have not provided any details from which we can estimate peak performance or power consumption of their solutions [123]. • The Centaur Technology CNS processor [124], [125] includes eight x86 cores along with an integrated AI accelerator realized as a 4,096 byte-wide SIMD unit. The Centaur AI coprocessor (CT-AIC) will deliver peak performance of 20 TOPS with INT8 precision at 2.5 GHz and can also operate at FP16 and INT16 precisions, though at lower performance.…”
Section: A New Accelerators (mentioning)
confidence: 99%
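
The peak-performance figure quoted in the last statement can be roughly checked from the other two numbers it gives (a 4,096-byte-wide SIMD unit and a 2.5 GHz clock). The sketch below is a back-of-the-envelope estimate, not the vendor's method: it assumes one INT8 lane per byte of SIMD width and one multiply-accumulate per lane per cycle, counted as two operations. The function name `peak_tops` and the `ops_per_lane` parameter are hypothetical and introduced only for illustration.

```python
def peak_tops(simd_width_bytes: int, clock_ghz: float, ops_per_lane: int = 2) -> float:
    """Estimate peak INT8 throughput in tera-operations per second.

    Assumptions (not stated in the quoted text):
      - each byte of SIMD width is one INT8 lane;
      - each lane completes one multiply-accumulate per cycle,
        counted as ops_per_lane = 2 operations (multiply + add).
    """
    ops_per_cycle = simd_width_bytes * ops_per_lane
    ops_per_second = ops_per_cycle * clock_ghz * 1e9
    return ops_per_second / 1e12


if __name__ == "__main__":
    # Figures taken from the quoted citation statement: 4,096 bytes, 2.5 GHz.
    estimate = peak_tops(4096, 2.5)
    print(f"Estimated peak: {estimate:.2f} TOPS")
```

Under these assumptions the estimate comes out slightly above 20 TOPS, which is consistent with the rounded figure in the statement. Wider data types such as INT16 or FP16 would occupy more bytes per lane, which fits the statement's note that those precisions run at lower performance.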