2015 44th International Conference on Parallel Processing 2015
DOI: 10.1109/icpp.2015.105

Automatic OpenCL Code Generation for Multi-device Heterogeneous Architectures

Cited by 12 publications (7 citation statements)
References 10 publications
“…We confirm the findings from Reference 4 that the MASON backend performed considerably better than the C with OpenMP backend when the agent number is high enough. Interestingly, the performance of the FPGA, which is restricted by the relatively low operating frequency and slow off-chip global memory access, is able to catch up with OpenMP at 2^14 agents and even outperform it for the 2^16 agent scenario in both Game-of-Life and Sugarscape (Figure 7A-C). This is caused by the efficient neighbor search and the pipelining parallelism the FPGA implements.…”
Section: Methods (mentioning)
confidence: 99%
“…We demonstrate the benefit of the online dispatcher as well as the co-execution capability using three simulation models: Circle, Crowd, and Traffic. The former two were simulated using 2^16 agents, while the more complex traffic simulation was populated by 2^13 agents. The merge_functions for Crowd and Traffic are defined as: combining the results from the path finding model and the social force model for Crowd and taking the result of the car-following model for Traffic.…”
Section: Online Dispatcher (mentioning)
confidence: 99%
“…Meta Functions: Li et al. [11] as well as Diop et al. [4] employ meta functions to determine data splits independently of the problem size. These meta functions have to be supplied by the programmer, describing the memory access patterns of the employed algorithms.…”
Section: Job Design (mentioning)
confidence: 99%
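To make the meta-function idea in the statement above concrete, here is a minimal Python sketch. It is not taken from the cited papers or from STEPOCL: the names `split_output`, `stencil_meta`, and `plan`, the per-device weights, and the 1D stencil access pattern are all assumptions chosen for illustration. The programmer-supplied meta function maps an output range to the input range the kernel reads, so the same description yields a valid split for any problem size.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # half-open index range [lo, hi)


def split_output(n: int, weights: List[int]) -> List[Range]:
    """Split output indices [0, n) into contiguous chunks proportional to weights.

    The weights are hypothetical relative device speeds (an assumption).
    """
    total = sum(weights)
    bounds = [0]
    acc = 0
    for w in weights:
        acc += w
        bounds.append(n * acc // total)  # integer arithmetic, last bound is n
    return [(bounds[i], bounds[i + 1]) for i in range(len(weights))]


def stencil_meta(out_range: Range, n: int, radius: int = 1) -> Range:
    """Hypothetical meta function for a 1D stencil.

    Returns the input range a device must hold to compute out_range,
    clamped to the array bounds [0, n).
    """
    lo, hi = out_range
    return (max(0, lo - radius), min(n, hi + radius))


def plan(n: int, weights: List[int], radius: int = 1) -> List[Tuple[Range, Range]]:
    """Pair each device's output chunk with the input range it needs."""
    return [(chunk, stencil_meta(chunk, n, radius))
            for chunk in split_output(n, weights)]


if __name__ == "__main__":
    # Two equally weighted devices; each needs one halo element from its neighbor.
    for out_rng, in_rng in plan(100, [1, 1]):
        print("output", out_rng, "needs input", in_rng)
```

Because `stencil_meta` describes only the access pattern, calling `plan` with a different `n` or a different list of weights produces a consistent split without changing the meta function.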
“…The technique we propose does not require training runs. Li et al. [9] present STEPOCL, a tool which takes as input kernels along with a configuration file and generates automatically an OpenCL multi-devices application. The configuration file describes how to split data, the control flow of the program, and allow to have specialized kernels for different architectures.…”
Section: Related Work (mentioning)
confidence: 99%