2021 Design, Automation & Test in Europe Conference & Exhibition (DATE)
DOI: 10.23919/date51398.2021.9474230
Indirection Stream Semantic Register Architecture for Efficient Sparse-Dense Linear Algebra

Abstract: Sparse linear algebra is crucial in many application domains, but challenging to handle efficiently in both software and hardware, with one- and two-sided operand sparsity handled with distinct approaches. In this work, we enhance an existing memory-streaming RISC-V ISA extension to accelerate both one- and two-sided operand sparsity on widespread sparse tensor formats like compressed sparse row (CSR) and compressed sparse fiber (CSF) by accelerating the underlying operations of streaming indirection, intersection…
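
The two primitives the abstract names are worth making concrete. The C sketch below is not from the paper; it is a minimal software rendering, with hypothetical function names, of what the extension accelerates: streaming indirection (the gather inside a CSR sparse-times-dense product) and streaming intersection (the sorted-index merge behind a two-sided sparse-times-sparse product).

```c
#include <stddef.h>

/* Streaming indirection: the gather at the heart of a CSR sparse-times-dense
 * row product, acc = sum_j val[j] * x[col[j]]. A stream semantic register
 * extension would resolve the col[j] -> x[col[j]] indirection in hardware;
 * here it is spelled out in software for illustration. */
static double csr_row_dot_dense(const double *val, const size_t *col,
                                size_t nnz, const double *x) {
    double acc = 0.0;
    for (size_t j = 0; j < nnz; ++j)
        acc += val[j] * x[col[j]];   /* indirect load through the index stream */
    return acc;
}

/* Streaming intersection: two-sided sparsity (sparse-times-sparse dot product)
 * reduces to merging two sorted index streams and multiplying only where the
 * coordinates of the two operands match. */
static double sparse_sparse_dot(const double *aval, const size_t *aidx, size_t an,
                                const double *bval, const size_t *bidx, size_t bn) {
    double acc = 0.0;
    size_t i = 0, j = 0;
    while (i < an && j < bn) {
        if (aidx[i] == bidx[j])          /* both operands nonzero at this index */
            acc += aval[i++] * bval[j++];
        else if (aidx[i] < bidx[j])
            ++i;                          /* advance the stream that is behind */
        else
            ++j;
    }
    return acc;
}
```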

Cited by 7 publications (4 citation statements)
References 34 publications
“…One could expect new breakthroughs to enable higher sparsity closer to those in scientific computing (>99.9%). Then, another class of accelerators, such as SpArch, Indirection Stream Semantic Registers [Scheffler et al 2020], or ExTensor [Hegde et al 2019] would play a bigger role. 7.2.3 Overview of accelerators for sparse deep learning.…”
Section: Training Accelerators (mentioning)
confidence: 99%
“…In our experiments, the lower bandwidth was compensated by the reduction in the amount of processed data, and the Flare in-network sparse allreduce outperformed the SparCML host-based sparse allreduce. However, we believe that there is still space for improvement, either by optimizing the handlers code or by introducing hardware support to optimize indirect memory accesses [84].…”
Section: Discussion (mentioning)
confidence: 99%

Flare: Flexible In-Network Allreduce
De Sensi, Di Girolamo, Ashkboos et al. 2021 (Preprint, Self Cite)
“…These works mainly focus on proposing extensions to the instruction set architecture, such as new instructions and new core-side micro-architecture, where the main idea is to introduce memory streams in the ISA and decouple computation and memory access [7,21,39,40,25,41]. For example, Wang et al.…”
Section: Core-side Stream Extensions (mentioning)
confidence: 99%
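
As a rough illustration of the decoupling this quote describes, the sketch below models an indirection stream in plain C. The struct and stream_next() are hypothetical names used only for exposition, not the actual interface of any of the cited ISA extensions: the point is that the compute loop contains no address arithmetic, so hardware backing the stream could resolve the gathers ahead of the multiply-accumulates.

```c
#include <stddef.h>

/* Conceptual model of a core-side indirection stream: the "access" side walks
 * an index array and resolves one indirect load per step, while the "execute"
 * side only consumes ready values. In the ISA extensions discussed above this
 * state lives in a stream semantic register, not a C struct. */
typedef struct {
    const size_t *idx;    /* index stream (e.g., CSR column indices) */
    const double *base;   /* dense operand being gathered from */
    size_t pos;           /* current position in the index stream */
} ind_stream_t;

static inline double stream_next(ind_stream_t *s) {
    return s->base[s->idx[s->pos++]];   /* one indirection per element */
}

/* The compute loop holds no address arithmetic: reading the "register"
 * implicitly advances the stream, which is what lets hardware prefetch and
 * overlap the gathers with the arithmetic. */
static double dot_with_stream(const double *val, ind_stream_t *s, size_t nnz) {
    double acc = 0.0;
    for (size_t j = 0; j < nnz; ++j)
        acc += val[j] * stream_next(s);
    return acc;
}
```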