2022 IEEE Hot Chips 34 Symposium (HCS) 2022
DOI: 10.1109/hcs55958.2022.9895629

System Architecture and Software Stack for GDDR6-AiM

citations
Cited by 18 publications
(9 citation statements)
references
References 0 publications
“…To support the second and third methods, we should flush cached PIM operands into DRAM before the PIM execution and invalidate them. The cache flush and invalidation incur significant overhead, so most PIM studies declared the PIM operands as uncached attributes [22], [23], [24], [25], [26]. However, the CPU access to the uncached data is too slow because of their strictly ordered memory operations.…”
Section: A. PIM Execution Behavior vs. DMA (mentioning)
confidence: 99%
“…Processing-in-Memory (PIM) architectures have been actively studied by placing computing units close to [9], [10], [11], [12], and [13] or inside memory [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26] to overcome the memory bandwidth limitation. PIM can maximize internal memory bandwidth for the computation using bank-level parallelism [14], [15], [17], [18], [22], [23], [24], [25], [26], thus providing high computation performance. For example, the decoupled PIM [26] achieved a speedup of 75.8x and 1.2x over CPU and GPU at the Level-3 BLAS, respectively.…”
Section: Introduction (mentioning)
confidence: 99%