2021
DOI: 10.1109/mcse.2021.3057203

Accelerating Scientific Applications With SambaNova Reconfigurable Dataflow Architecture

Cited by 39 publications (12 citation statements)
References 2 publications
“…Rocki et al [2020] discussed the possibility of using DCAI systems for PDE codes in scientific applications, and demonstrated the benefit over using conventional CPU or GPU based solutions. Emani et al [2021] explored the suitability of SambaNova, another DCAI system, for diverse AI for Science workloads and observed significant performance gains over traditional hardware. Acciarri et al [2020] advanced state-of-the-art accuracy for an important neutrino physics image segmentation problem leveraging the large memory of DCAI systems which are not possible to fit on the highest-end GPU because of the large tensor size (convolutional neural networks with images beyond 50k x 50k resolution).…”
Section: Results (mentioning)
confidence: 99%
“…NPUs: Examples of NPUs include Google's Tensor Processing Unit (TPU) [20], tensor cores in NVIDIA A100 Ampere architecture, Samsung NPU [21], SambaNova's RDU [22], IBM's AI Accelerator [23], Microsoft Brainwave [24], Tesla's Self-Driving computer [25], Facebook's ML accelerator [26], etc. NPU architectures can be standalone, a co-processor, or a near-data processing engine [27]-[29].…”
Section: A. NPU Design Requirements and Challenges (mentioning)
confidence: 99%
“…For multi-user facilities, reconfigurability is essential; thus our interest in an architecture that can be programmed specifically for any model application but nevertheless results in an application-specific optimized accelerator. The core of the SambaNova Reconfigurable Dataflow Architecture™ (RDA) [14,15] is a dataflow-optimized processor, the Reconfigurable Dataflow Unit™ (RDU). It has a tiled architecture that is made up of a network of programmable compute (PCUs), memory (PMUs) and communication units.…”
Section: SambaNova Reconfigurable Dataflow Architecture (mentioning)
confidence: 99%
“…The SambaFlow framework automatically handles the parallelization used here for data parallel training with the DataScale® platform, a rack-level, datacenter accelerated computing platform. The platform consists of one or more DataScale SN10-8 nodes with integrated networking and management infrastructure in a standards-compliant data center rack, the DataScale SN10-8R [15]. We used up to 4 SN10-8 nodes for the results presented here, each consisting of a host module and 8 RDUs.…”
Section: Distributed Training (mentioning)
confidence: 99%