2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
DOI: 10.1109/ipdps.2018.00077
CoolPIM: Thermal-Aware Source Throttling for Efficient PIM Instruction Offloading

Cited by 16 publications (22 citation statements); references 12 publications.
“…For DCC architectures, solutions can be divided into two main categories: (1) PIM systems, which perform computations using special circuitry inside the memory module or by taking advantage of particular aspects of the memory itself, e.g., simultaneous activation of multiple DRAM rows for logical operations [1, 11–25]; (2) NMP systems, which perform computations on a PE placed close to the memory module, e.g., CPU or GPU cores placed on the logic layer of 3D-stacked memory [26–42]. For the purposes of this survey, we classify systems that use logic layers in 3D-stacked memories as NMP systems, as these logic layers are essentially computational cores that are near the memory stack (directly underneath it).…”
Section: Data-centric Computing Architectures
confidence: 99%
“…Offloading can be performed at different granularities, e.g., instructions (including small groups of instructions) [1,13,16,19,24,25,28,32,37,39,40,42,57,91,92], threads [71], Nvidia's CUDA blocks/warps [27,29], kernels [26], and applications [38,41,73,74]. Instruction-level offloading is often used with a fixed-function accelerator and PIM systems [1,13,16,19,24,25,28,29,32,37,39,42,57,92]. For example, [42] offloads atomic instructions at instruction-level granularity to a fixed-function near-memory graph accelerator.…”
Section: Data Offloading Granularity
confidence: 99%
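The excerpt above describes offloading individual PIM-capable instructions, which is the granularity CoolPIM's source throttling operates at. As a minimal sketch only, the following hypothetical Python class illustrates the general idea of gating instruction-level offload with a token budget that shrinks on thermal feedback; all names, the token mechanism, and thresholds are illustrative assumptions, not the paper's actual design.

```python
class OffloadThrottler:
    """Hypothetical token-based gate for instruction-level PIM offloading.

    Each offloaded PIM instruction consumes a token; when the budget is
    exhausted, the instruction runs on the host instead. A thermal warning
    from the memory stack shrinks the budget (illustrative only).
    """

    def __init__(self, max_inflight=4):
        self.max_inflight = max_inflight  # assumed token budget
        self.inflight = 0                 # offloaded ops not yet completed

    def try_offload(self):
        """Return True if the next PIM-capable instruction may be offloaded."""
        if self.inflight < self.max_inflight:
            self.inflight += 1
            return True
        return False  # budget exhausted: fall back to host execution

    def complete(self):
        """Called when an offloaded instruction finishes, freeing a token."""
        self.inflight = max(0, self.inflight - 1)

    def on_thermal_warning(self):
        """Shrink the budget when the stack reports a thermal warning."""
        self.max_inflight = max(1, self.max_inflight - 1)
```

Under these assumptions, the host issues PIM instructions only while tokens remain, so raising the memory stack's temperature (via `on_thermal_warning`) automatically reduces the offload rate at the source.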