2016 Third Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC) 2016
DOI: 10.1109/llvm-hpc.2016.007

Towards Automatic HBM Allocation Using LLVM: A Case Study with Knights Landing

Citations: cited by 15 publications (18 citation statements)
References: 12 publications
“…Even the bandwidth advantage of MCDRAM is realized only when multiple threads access MCDRAM simultaneously. 12,55 For example, Ramos et al 12 show that for bitonic sort (a memory-bound code), use of MCDRAM provides no benefit over use of DRAM. This happens because in bitonic sort, all the cores access memory only in the initial stage, where the size of merged arrays is small.…”
Section: Table 3 Optimization Strategies (mentioning)
confidence: 99%
“…Sim et al [38] introduce an extra level of address translation from physical addresses into DRAM addresses, which allows the hardware to swap pages between the stacked DRAM and the off-chip memory without OS intervention. Khaldi et al [22] propose to automate the allocation of data objects at compile time. In this approach the compiler analyses the code to detect frequently used data and memory access patterns, it uses this information to calculate a priority value for each data object, and then it generates the appropriate functions calls to allocate the data objects in the stacked DRAM or in the off-chip memory.…”
Section: Software Management Of Heterogeneous Memory Systems (mentioning)
confidence: 99%
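
The compile-time approach described in the statement above (rank each data object by a priority value, then direct it to stacked DRAM or off-chip memory) can be illustrated with a minimal sketch. This is not the cited paper's algorithm: the `DataObject` fields, the accesses-per-byte `priority` metric, and the greedy `place_objects` policy are all hypothetical, chosen only to show the general shape of priority-driven placement under a fixed HBM capacity.

```python
from dataclasses import dataclass


@dataclass
class DataObject:
    """Hypothetical summary of one data object as a compiler might see it."""
    name: str
    size_bytes: int    # assumed to be greater than zero
    access_count: int  # assumed static estimate of how often it is accessed


def priority(obj: DataObject) -> float:
    # Assumed metric: estimated accesses per byte. The cited work computes
    # its own priority value; this stand-in is for illustration only.
    return obj.access_count / obj.size_bytes


def place_objects(objects, hbm_capacity_bytes):
    """Greedily assign the highest-priority objects to HBM while they fit."""
    placement = {}
    remaining = hbm_capacity_bytes
    for obj in sorted(objects, key=priority, reverse=True):
        if obj.size_bytes <= remaining:
            placement[obj.name] = "hbm"
            remaining -= obj.size_bytes
        else:
            placement[obj.name] = "ddr"
    return placement


if __name__ == "__main__":
    objs = [
        DataObject("a", 100, 1000),
        DataObject("b", 400, 800),
        DataObject("c", 50, 10),
    ]
    print(place_objects(objs, hbm_capacity_bytes=150))
```

In a real compiler pass, the resulting placement would drive which allocation call is emitted for each object; the sketch stops at the decision itself and makes no claim about how the cited work generates those calls.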
“…On the other hand, the selected scopes are speeding-up the application by a factor S at best. Allocating studied data on this memory, program execution time can be expressed as a function of x i , y i , z i , s and S (see equation 14).…”
Section: Scope-based Dataset Reduction (mentioning)
confidence: 99%
“…Some static metrics have been tested [14], but they are difficult to manage on mini-application bigger than small benchmarks, and the information extracted are not relevant enough. Besides, pointer aliasing makes it very difficult, if not impossible, to have good information on all data objects.…”
Section: Formulation (I) (mentioning)
confidence: 99%