2015
DOI: 10.1145/2788396

A Survey of CPU-GPU Heterogeneous Computing Techniques

Abstract: As both CPUs and GPUs become employed in a wide range of applications, it has been acknowledged that both of these Processing Units (PUs) have their unique features and strengths and hence, CPU-GPU collaboration is inevitable to achieve high-performance computing. This has motivated a significant amount of research on heterogeneous computing techniques, along with the design of CPU-GPU fused chips and petascale heterogeneous supercomputers. In this article, we survey Heterogeneous Computing Techniques (HCTs) s…

Citations: cited by 424 publications (214 citation statements)
References: 155 publications
“…While the word has been loosely used earlier, this study has provided guidance on how we should think about components, and thus, this definition will be followed for the rest of the discussion in this section. Fine-grained parallelism approaches, such as those surveyed in Mittal and Vetter (2015), may be applied within a component as so defined, but are likely to fail above that level. A substantial increase in overall scalability of an ESM may be achieved if several components are run concurrently.…”
Section: Discussion (mentioning)
confidence: 99%
“…OpenCL (Open Computing Language) is a recent standard, ratified by the Khronos Group, for cross-platform parallel programming with diverse processors [14]. OpenCL is welcomed for its portability, but it cannot achieve the highest possible performance because of its high-level abstraction [15]. Brook [16] is an extension to the C language for stream programming that was originally developed by Stanford University; Brook+ is an implementation of the Brook GPU specification on AMD's compute abstraction layer.…”
Section: Heterogeneous Computing (mentioning)
confidence: 99%
“…A scheduling strategy should consider both the internal characteristics of target algorithms and the external hardware attributes of the underlying PUs to determine suitable task partitioning and allocation. Currently, much research focuses on algorithm-level workload partitioning and scheduling [15]. Workload partitioning techniques have been designed based on the relative performance of PUs [20,24], the nature of subtasks [25], or other partitioning criteria for different algorithms and applications.…”
Section: Heterogeneous Computing (mentioning)
confidence: 99%
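
The statement above mentions workload partitioning based on the relative performance of PUs. A minimal sketch of that idea, under stated assumptions, might look like the following: work items are split in proportion to a measured throughput per PU, and leftover items go to the PUs with the largest fractional share. The function name `partition_by_throughput`, the PU labels, and the throughput figures are hypothetical and are not taken from the surveyed paper or the citing work.

```python
import math


def partition_by_throughput(n_items, throughputs):
    """Split n_items among PUs in proportion to their throughput.

    throughputs maps a PU label (hypothetical, e.g. "cpu", "gpu") to a
    measured rate such as items per second. Returns a dict of item counts.
    """
    if n_items < 0:
        raise ValueError("n_items must be non-negative")
    total = sum(throughputs.values())
    if total <= 0:
        raise ValueError("total throughput must be positive")

    # Ideal (fractional) share of each PU.
    exact = {pu: n_items * t / total for pu, t in throughputs.items()}
    # Start from the rounded-down share.
    shares = {pu: math.floor(x) for pu, x in exact.items()}

    # Hand the remaining items to the PUs with the largest fractional parts.
    leftover = n_items - sum(shares.values())
    by_fraction = sorted(exact, key=lambda pu: exact[pu] - shares[pu], reverse=True)
    for pu in by_fraction[:leftover]:
        shares[pu] += 1
    return shares


if __name__ == "__main__":
    # Assumed figures: the GPU processes items three times as fast as the CPU.
    print(partition_by_throughput(1000, {"cpu": 1.0, "gpu": 3.0}))
```

This is a static split only. Schemes that also account for the nature of subtasks, as the statement notes, would need more information than a single throughput number per PU.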
“…Similarly, in compute-intensive applications, while utilizing the accelerating device, the host CPUs remain idle, which leads to waste of energy and performance. Approaches that intelligently manage the resources of host CPUs and accelerating devices to address such inefficiencies seem promising [68]. To achieve higher performance, scalability and energy efficiency, engineers often combine Central Processing Units (CPUs), Graphical Processing Units (GPUs), or Field Programmable Gate Arrays (FPGAs).…”
Section: Introduction (mentioning)
confidence: 99%