Proceedings of the 21st International Conference on Parallel Architectures and Compilation Techniques 2012
DOI: 10.1145/2370816.2370824

Fast and efficient automatic memory management for GPUs using compiler-assisted runtime coherence scheme

Abstract: Exploiting the performance potential of GPUs requires efficiently managing the data transfers to and from them, which is an error-prone and tedious task. In this paper, we develop a software coherence mechanism to fully automate all data transfers between the CPU and GPU without any assistance from the programmer. Our mechanism uses compiler analysis to identify potential stale accesses and uses a runtime to initiate transfers as necessary. This allows us to avoid redundant transfers that are exhibited by all ot…

Citations: cited by 35 publications (17 citation statements)
References: 15 publications
“…Like previous memory management systems for GPUs ( [13,17,12,9]), SemCache maintains what amounts to a distributed shared memory (DSM) between the CPU and GPU. SemCache tracks shared data at a variable granularity-as memory ranges.…”
Section: SemCache (mentioning)
confidence: 99%
“…Prior work has used compiler analysis or programmer annotations to determine if the operation is a read or a write [13,12,9,17]. Since SemCache++ focuses on libraries, it can use simple directives inserted into the library code to indicate which matrices are read and written by the GPU, as well as which submatrices are needed by tasks dispatched to various GPUs.…”
Section: Instrumenting GPU Reads and Writes (mentioning)
confidence: 99%
“…Pai et al propose a system that automates CPU-GPU memory management based on a coherence scheme in order to reduce superfluous communication [14]. To do this, when a data item is accessed on one side (CPU or GPU side), it is transferred (from the other side) if it is not locally available or if its local version is stale.…”
Section: Related Work (mentioning)
confidence: 99%
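
The transfer rule described in the statement above (copy a data item only when the accessing side lacks an up-to-date copy) can be illustrated with a small state-tracking sketch. This is not the paper's implementation: the class name `CoherenceRuntime`, the three states, the `acquire` method, and the transfer log are all hypothetical, and real transfers are replaced by log entries so the example runs without a GPU. It also assumes, for simplicity, that a write is preceded by a fetch of the current contents.

```python
from enum import Enum


class State(Enum):
    # Hypothetical per-item coherence states, chosen for illustration.
    CPU_ONLY = "cpu"     # only the CPU copy is up to date
    GPU_ONLY = "gpu"     # only the GPU copy is up to date
    SHARED = "shared"    # both copies are up to date


class CoherenceRuntime:
    """Toy runtime: records which side holds a valid copy of each item."""

    def __init__(self):
        self.state = {}
        self.transfers = []  # log of (item name, direction) instead of real copies

    def register(self, name):
        # Assumption: data starts out valid on the CPU only.
        self.state[name] = State.CPU_ONLY

    def acquire(self, name, side, write=False):
        """Call before `side` ("cpu" or "gpu") reads or writes `name`."""
        current = self.state[name]
        stale_here = (
            (side == "cpu" and current is State.GPU_ONLY)
            or (side == "gpu" and current is State.CPU_ONLY)
        )
        if stale_here:
            # The local copy is missing or stale: fetch it from the other side.
            direction = "gpu->cpu" if side == "cpu" else "cpu->gpu"
            self.transfers.append((name, direction))
            current = State.SHARED
        if write:
            # A write invalidates the copy held by the other side.
            current = State.CPU_ONLY if side == "cpu" else State.GPU_ONLY
        self.state[name] = current


rt = CoherenceRuntime()
rt.register("a")
rt.acquire("a", "gpu")              # stale on the GPU: one transfer
rt.acquire("a", "gpu")              # already valid: no transfer
rt.acquire("a", "cpu")              # CPU copy still valid: no transfer
rt.acquire("a", "gpu", write=True)  # GPU write invalidates the CPU copy
rt.acquire("a", "cpu")              # stale on the CPU: one transfer back
print(rt.transfers)
```

In this sketch, five accesses cause only two transfers, because repeated accesses to an already valid copy are recognized as redundant. In the scheme the abstract describes, the calls corresponding to `acquire` would be placed by compiler analysis at potentially stale accesses rather than written by hand.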