21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP 2013)
DOI: 10.1109/pdp.2013.65
Evaluation of Successive CPUs/APUs/GPUs Based on an OpenCL Finite Difference Stencil

Abstract: The AMD APU (Accelerated Processing Unit) architecture, which combines CPU and GPU cores on the same die, is promising for GPU applications whose performance is bottlenecked by the low PCI Express communication rate. However, the first APU generations still have separate CPU and GPU memory partitions, and the integrated GPUs are currently less powerful than discrete GPUs. In this paper we therefore investigate the relevance of APUs for scientific computing by evaluating and comparing…

Cited by 11 publications (3 citation statements)
References 3 publications
“…Access to the main memory is often shared among components; on heterogeneous CPU-GPU platforms, both components can access the main memory, and the GPU can use both coherent and noncoherent communication [32].…”
Section: Chip-level Heterogeneity (mentioning)
confidence: 99%
“…In [24], the authors implement a seismic model using task-based programming, but they do not use heterogeneous cores for the simulations. Calandra et al. [25] evaluate the execution of a finite-difference stencil on CPUs, APUs and GPUs, but they combine neither heterogeneous cores nor task-based parallel programming.…”
Section: Related Work (mentioning)
confidence: 99%
“…This makes it possible to perform DMA (direct memory access) operations, enabling the overlap of memory copies with kernel execution (in CUDA this overlap is achieved with asynchronous calls, which only work if the memory allocated on the CPU is pinned [68]). The use of this kind of memory for CPU ⇔ device transfers is recommended because these transfers go over the PCI Express bus, which is the bottleneck [69] in many heterogeneous architectures.…”
Section: Memory for Transfers Between CPU and Device (unclassified)
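The pinned-memory mechanism this last statement describes can be sketched in CUDA. This is a minimal illustration, not code from the cited works: the `scale` kernel, buffer sizes, and single-stream setup are assumptions chosen for brevity; only `cudaHostAlloc`, `cudaMemcpyAsync`, and streams are the actual API pieces being discussed.

```cuda
#include <cuda_runtime.h>

// Illustrative kernel: doubles each element of a vector in place.
__global__ void scale(float *d, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= 2.0f;
}

int main(void) {
    const int n = 1 << 20;
    float *h_buf, *d_buf;

    // Pinned (page-locked) host allocation: required for truly
    // asynchronous cudaMemcpyAsync, which lets the DMA engine
    // move data over PCI Express while the GPU keeps computing.
    cudaHostAlloc(&h_buf, n * sizeof(float), cudaHostAllocDefault);
    cudaMalloc(&d_buf, n * sizeof(float));

    cudaStream_t stream;
    cudaStreamCreate(&stream);

    // Copy-in, kernel, and copy-out are enqueued asynchronously on
    // one stream; with several streams over chunks of the data, the
    // copies of one chunk overlap the kernel working on another.
    cudaMemcpyAsync(d_buf, h_buf, n * sizeof(float),
                    cudaMemcpyHostToDevice, stream);
    scale<<<(n + 255) / 256, 256, 0, stream>>>(d_buf, n);
    cudaMemcpyAsync(h_buf, d_buf, n * sizeof(float),
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream);

    cudaStreamDestroy(stream);
    cudaFree(d_buf);
    cudaFreeHost(h_buf);
    return 0;
}
```

With pageable (non-pinned) host memory, the same `cudaMemcpyAsync` calls silently fall back to staged, effectively synchronous copies, which is why the quoted statement stresses pinned allocation.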