Proceedings of the 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays 2022
DOI: 10.1145/3490422.3502357

Sextans: A Streaming Accelerator for General-Purpose Sparse-Matrix Dense-Matrix Multiplication

Abstract: Sparse-Matrix Dense-Matrix multiplication (SpMM) is the key operator for a wide range of applications, including scientific computing, graph processing, and deep learning. Architecting accelerators for SpMM faces three challenges: (1) random memory access and unbalanced processing load caused by the random distribution of elements in sparse matrices, (2) inefficient data handling of large matrices that cannot fit on-chip, and (3) a non-general-purpose accelerator design where one acceler…
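
For readers unfamiliar with the operator, the following is a minimal software sketch of what SpMM computes. It is not the Sextans architecture. The coordinate (COO) input layout and the function name `spmm_coo` are assumptions made for illustration only. Each non-zero selects a row of the dense matrix by its column index and accumulates into an output row chosen by its row index, which is where the data-dependent memory access mentioned in challenge (1) comes from.

```python
def spmm_coo(rows, cols, vals, B, n_rows):
    """Multiply a sparse matrix given as COO triples by a dense matrix B.

    rows, cols, vals: parallel lists describing the non-zeros A[r][c] = v.
    B: dense matrix as a list of equal-length rows.
    n_rows: number of rows of the sparse matrix (and of the result).
    Hypothetical helper for illustration; not taken from the paper.
    """
    n_cols = len(B[0]) if B else 0
    C = [[0.0] * n_cols for _ in range(n_rows)]
    for r, c, v in zip(rows, cols, vals):
        # Non-zero A[r][c] scales row c of B and accumulates into row r of C.
        # Both r and c depend on the sparsity pattern, so accesses are irregular.
        b_row = B[c]
        c_row = C[r]
        for j in range(n_cols):
            c_row[j] += v * b_row[j]
    return C


# A = [[1, 0, 2],
#      [0, 0, 3]]
rows = [0, 0, 1]
cols = [0, 2, 2]
vals = [1.0, 2.0, 3.0]
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
C = spmm_coo(rows, cols, vals, B, n_rows=2)
print(C)  # [[11.0, 14.0], [15.0, 18.0]]
```

Rows with many non-zeros do more work than rows with few, which gives a rough picture of the load imbalance the abstract describes.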

Cited by 41 publications (18 citation statements)
References 61 publications
“…Serpens [20] and HiSparse [21] are two state-of-the-art SpMV accelerators on FPGAs and both target HBM. In both cases, however, the data type of their architecture is either fixed point or single-precision floating point.…”
Section: A Double-precision SpMV on FPGAs with HBM (mentioning)
confidence: 99%
“…There have been different proposals for FPGA-specific matrix encodings and algorithms [16], [20], [21], [23], [25]. Some of these even allow on-the-fly transformation from multiple formats inside the FPGA [19].…”
Section: B Sparse Matrix Representation Formats (mentioning)
confidence: 99%
“…Thus, the main challenge is to deal with the mismatch between the throughput of the transferred non-zero matrix elements (i.e., the throughput of the off-chip memory bandwidth) and the throughput of the vector buffer. Previous work [4][3] holds multiple copies of the input vector to increase the throughput of the on-chip vector buffer. However, they fail to explore data reuse of fetched vector values which can further compress sparse matrices.…”
Section: SpMV and Challenges (mentioning)
confidence: 99%
“…There are some works targeting HBM-based FPGAs. Serpens [4] proposes memory-centric processing engines to fully exploit the benefits of HBM. It proposes an index coalescing technique to improve URAM utilization and non-zero reordering to avoid URAM address conflicts.…”
Section: Related Work (mentioning)
confidence: 99%