Proceedings of the 2016 International Conference on Parallel Architectures and Compilation
DOI: 10.1145/2967938.2967940

Scheduling Techniques for GPU Architectures with Processing-In-Memory Capabilities

Cited by 149 publications (98 citation statements)
References 90 publications
“…The logic considered varies in type, e.g., simple in-order cores [4], [29], [32], [37], [51], [52], [54], [55], [58], graphics processing units [34], [38], [46], [48], field-programmable gate arrays [41], [43], [49], and application-specific accelerators [30], [39], [40], [45], [47], [50], [53], [56]. The majority of the NMC proposals are targeted towards different types of data processing applications, e.g.…”
Section: Discussion
confidence: 99%
“…Each operation is issued by the main application running on the host and served by a control program loaded by the OS on each DRE engine. Similar to [48], the authors propose to invalidate the CPU caches after each fill and drain operation to maintain memory consistency between the near-memory processors and the main CPU. As pointed out earlier, this approach can introduce a significant overhead.…”
Section: Reconfigurable Unit
confidence: 99%
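The coarse-grained consistency scheme described in this citation can be sketched in a few lines: after each fill (host to near-memory) or drain (near-memory to host) operation, the host invalidates its cached copies of the touched buffer, so later CPU reads refetch from DRAM. This is a minimal toy model, assuming a simple address-keyed cache; the names (`Cache`, `pim_drain`) are illustrative, not any real API.

```python
class Cache:
    """Toy CPU cache: maps addresses to cached values."""
    def __init__(self):
        self.lines = {}

    def read(self, mem, addr):
        if addr not in self.lines:            # miss: fetch from memory
            self.lines[addr] = mem[addr]
        return self.lines[addr]               # hit: may return a stale copy

    def invalidate_range(self, start, length):
        # Drop every cached line in [start, start + length).
        for addr in list(self.lines):
            if start <= addr < start + length:
                del self.lines[addr]


def pim_drain(mem, cache, start, length, kernel):
    """Run a near-memory kernel that updates DRAM directly, then
    invalidate the CPU cache over the touched range -- the per-operation
    overhead the citing paper points out."""
    for addr in range(start, start + length):
        mem[addr] = kernel(mem[addr])         # PIM writes bypass the CPU cache
    cache.invalidate_range(start, length)     # keep CPU and PIM consistent


mem = {i: i for i in range(8)}
cache = Cache()
stale = [cache.read(mem, a) for a in range(8)]   # warm the cache
pim_drain(mem, cache, 0, 8, lambda v: v * 10)
fresh = [cache.read(mem, a) for a in range(8)]   # refetch after invalidation
print(fresh)   # reflects the PIM update, not the stale cached copies
```

Without the `invalidate_range` call, the second round of reads would return the warmed (stale) values, which is exactly the inconsistency the invalidation is meant to prevent.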
“…These cores promise a 5× increase in Deep Learning performance compared to previous GPU generations [38]. Furthermore, GPUs have been shown to be amenable to near-memory processing as well [39].…”
Section: GPUs
confidence: 99%
“…The proposed method by Pattnaik et al. [34] also employs some metrics for the purpose of kernel classification into either the GPU-PIM or the GPU-PIC class, where PIM and PIC stand for processing-in-memory and processing-in-core, respectively. However, the four static metrics used in [34] relate to the underlying hardware and so are subject to change from hardware to hardware. Therefore, one hardware platform may require reading three metrics in category I, while another may require five.…”
Section: Related Work
confidence: 99%
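The classification idea this citation attributes to Pattnaik et al. [34] can be sketched as a threshold test over static kernel metrics, where the cutoffs are hardware-specific, which is the citing paper's objection. The metric names, the two-metric rule, and the threshold values below are illustrative assumptions, not the paper's actual metrics or numbers.

```python
from dataclasses import dataclass

@dataclass
class KernelProfile:
    mem_intensity: float      # memory instructions / total instructions
    compute_intensity: float  # arithmetic instructions / total instructions

@dataclass
class HardwareThresholds:
    """Per-platform cutoffs; the citing paper's point is that these
    must be re-derived for every target hardware."""
    mem_cutoff: float = 0.4
    compute_cutoff: float = 0.5

def classify(profile: KernelProfile, hw: HardwareThresholds) -> str:
    """Steer memory-bound kernels to near-memory units (GPU-PIM),
    everything else to the main GPU cores (GPU-PIC)."""
    if (profile.mem_intensity >= hw.mem_cutoff
            and profile.compute_intensity < hw.compute_cutoff):
        return "GPU-PIM"
    return "GPU-PIC"

streaming = KernelProfile(mem_intensity=0.6, compute_intensity=0.2)
dense_math = KernelProfile(mem_intensity=0.1, compute_intensity=0.8)
hw = HardwareThresholds()
print(classify(streaming, hw))    # GPU-PIM
print(classify(dense_math, hw))   # GPU-PIC
```

Keeping the thresholds in a separate `HardwareThresholds` object makes the hardware dependence explicit: porting the classifier to a new platform means re-tuning that object, not rewriting the decision rule.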