2021 58th ACM/IEEE Design Automation Conference (DAC) 2021
DOI: 10.1109/dac18074.2021.9586257

RASA: Efficient Register-Aware Systolic Array Matrix Engine for CPU

Cited by 9 publications (6 citation statements)
References 6 publications
“…We use a 32×16 conventional WS systolic array as the baseline, inspired by RASA [29] and Intel's TMUL [26]. Comparing against a dense matrix engine rather than a vector engine provides a strong baseline due to the huge gap in compute throughput between typical matrix engines and vector engines (Section III-A).…”
Section: E Vegeta Design Overview (mentioning)
confidence: 99%
“…For modest-sized tiles, filling and draining the systolic array for a tile GEMM/SPMM instruction can significantly lower PE utilization. A recent work RASA [29] introduced pipelining the execution of tile GEMM instructions on a dense systolic array-based matrix engine; this overlaps different stages of execution for different instructions. We extend the pipelining concept and show how different tile SPMM instructions can be executed concurrently.…”
Section: Optimizations (mentioning)
confidence: 99%
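
The citation statement above says that filling and draining the array lowers PE utilization for modest-sized tiles, and that pipelining tile GEMM instructions overlaps those stages. The following is a minimal sketch of that effect using a toy cycle model. It is not the model used in RASA or in the citing paper. The function name `utilization`, its parameters, and the cycle counts are all hypothetical, chosen only for illustration. The model assumes that pipelining hides fill and drain completely between back-to-back instructions, which is the ideal case.

```python
def utilization(n_instr, stream, fill, drain, pipelined):
    """Fraction of cycles spent streaming useful data in a toy model.

    n_instr   -- number of back-to-back tile instructions (hypothetical)
    stream    -- cycles each instruction spends doing useful work
    fill      -- cycles to fill the array before useful work starts
    drain     -- cycles to drain results after useful work ends
    pipelined -- if True, assume fill/drain overlap with neighboring
                 instructions, so they are paid only once overall
    """
    useful = n_instr * stream
    if pipelined:
        # Idealized overlap: one fill at the start, one drain at the end.
        total = fill + useful + drain
    else:
        # Every instruction pays its own fill and drain.
        total = n_instr * (fill + stream + drain)
    return useful / total


if __name__ == "__main__":
    # Illustrative numbers only: 16 cycles each for fill, stream, and drain.
    for n in (1, 4, 16):
        serial = utilization(n, 16, 16, 16, pipelined=False)
        overlapped = utilization(n, 16, 16, 16, pipelined=True)
        print(f"{n:2d} instructions: serial={serial:.2f} pipelined={overlapped:.2f}")
```

In this toy model the serial utilization stays constant no matter how many instructions are issued, while the pipelined utilization approaches 1 as the instruction count grows, because the fixed fill and drain cost is amortized over more useful cycles.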