Proceedings of the 49th Annual International Symposium on Computer Architecture 2022
DOI: 10.1145/3470496.3527402
Gearbox

Abstract: Processing-in-memory (PIM) minimizes data movement overheads by placing processing units near each memory segment. Recent PIMs employ processing units with a SIMD architecture. However, kernels with random accesses, such as sparse-matrix-dense-vector (SpMV) and sparse-matrix-sparse-vector (SpMSpV), cannot effectively exploit the parallelism of SIMD units because the SIMD ALUs remain idle until all operands are collected from local memory segments (the memory segment attached to the processing unit) or remote mem…
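The irregular access pattern the abstract describes can be illustrated with a minimal CSR SpMV sketch (names and layout are illustrative, not taken from the paper): the gather `x[col_idx[j]]` touches arbitrary positions of the dense vector, so in a SIMD PIM each lane may need an operand from a different, possibly remote, memory segment before the multiply-accumulate can proceed.

```python
# Hypothetical CSR SpMV sketch; identifiers are illustrative, not from Gearbox.
def spmv_csr(row_ptr, col_idx, vals, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    The load x[col_idx[j]] is a data-dependent (random) access: this is
    the gather that leaves SIMD ALUs idle in a PIM until every operand
    arrives from its local or remote memory segment.
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for j in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[j] * x[col_idx[j]]  # irregular gather from x
        y[i] = acc
    return y

# 3x3 example: A = [[1,0,2],[0,3,0],[4,0,5]], x = [1,1,1]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
print(spmv_csr(row_ptr, col_idx, vals, [1.0, 1.0, 1.0]))  # → [3.0, 3.0, 9.0]
```

A dense matrix-vector product streams both operands sequentially; here only `vals` streams, while `x` is gathered, which is why such kernels underutilize SIMD lanes.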

Cited by 9 publications (2 citation statements); references 54 publications.
“…While Rowclone [40] proposes bulk copying of a row of data across different banks, the data-movement patterns induced by data analytics are flexible, so Rowclone cannot be utilized effectively. TransPIM [51] and GearBox [31] propose application-specific networks-on-chip (NoCs) for efficient data movement inside DRAM. However, their NoCs incur a large area overhead given the DRAM area constraint.…”
Section: Challenge of Data Analytics, A. Internal Data Movement Overhead
confidence: 99%
“…As a result, the inevitable memory-bottleneck problem drives both industry and academia to reassess DRAM-based near-memory-processing (NMP) [7], [12], [19], [25], [50] and processing-in-memory (PIM) [13], [14], [16], [21], [22], [30], [31], [32], [34], [38], [48], [49] architectures, which increase internal bandwidth by integrating computational logic closely with DRAM devices/cells. NMP architectures integrate a homogeneous processing unit (PU) per vault in the base logic die of a hybrid memory cube (HMC), supporting flexible dataflow for query operations.…”
Section: Introduction
confidence: 99%