2013
DOI: 10.1002/cpe.3194

Communication and computation optimization of concurrent kernels using kernel coalesce on a GPU

Abstract: General-purpose computation on graphics processing units (GPUs) is rapidly entering various scientific and engineering fields. Many applications are being ported to GPUs for better performance. Various optimizations, frameworks, and tools are being developed for effective GPU programming. As part of communication and computation optimizations for GPUs, this paper proposes and implements an optimization method called kernel coalesce that further enhances GPU performance and also optimizes CPU to GPU …

Cited by 8 publications (4 citation statements)
References 28 publications
“…This tool not only combines all the memory transfers of multiple kernels to be sent to GPU but also optimizes the performance if the kernels use the shared data. Bayyapu et al [15] also emphasized the importance of CPU to GPU communication time and proposed a tool named kernel coalesce to reduce the number of kernel calls by merging concurrent kernels and optimized the GPU execution time by increasing the device utilization and by reducing the number of memory accesses for shared data among kernels.…”
Section: Related Work (mentioning)
confidence: 97%
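
The merging described in the statement above can be pictured with a toy model. The sketch below is not the kernel coalesce tool and does not run on a GPU: `ToyDevice`, `scale`, `offset`, and `coalesced` are hypothetical names introduced only for illustration, and the "device" simply counts host-to-device copies and kernel launches. Under those assumptions, it shows why merging two kernels that read the same input can halve both counts while producing the same results.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDevice:
    """Hypothetical stand-in for a GPU: it only stores data and counts events."""
    launches: int = 0
    transfers: int = 0
    memory: dict = field(default_factory=dict)

    def copy_to_device(self, name, data):
        # Each call models one CPU-to-GPU memory transfer.
        self.transfers += 1
        self.memory[name] = list(data)

    def launch(self, kernel, *names):
        # Each call models one kernel launch on buffers already on the device.
        self.launches += 1
        return kernel(*(self.memory[n] for n in names))


def scale(x):
    """First illustrative kernel: doubles every element."""
    return [2 * v for v in x]


def offset(x):
    """Second illustrative kernel: adds one to every element."""
    return [v + 1 for v in x]


def coalesced(x):
    """Merged kernel: reads each shared element once and computes both outputs."""
    scaled, shifted = [], []
    for v in x:
        scaled.append(2 * v)
        shifted.append(v + 1)
    return scaled, shifted


def run_separate(data):
    # Baseline: each kernel gets its own copy of the input and its own launch.
    dev = ToyDevice()
    dev.copy_to_device("x_for_scale", data)
    scaled = dev.launch(scale, "x_for_scale")
    dev.copy_to_device("x_for_offset", data)
    shifted = dev.launch(offset, "x_for_offset")
    return scaled, shifted, dev


def run_coalesced(data):
    # Merged: the shared input is transferred once and one launch does both jobs.
    dev = ToyDevice()
    dev.copy_to_device("x", data)
    scaled, shifted = dev.launch(coalesced, "x")
    return scaled, shifted, dev
```

Calling `run_separate([1, 2, 3])` and `run_coalesced([1, 2, 3])` yields the same two output lists; only the counters on the returned `ToyDevice` differ (two transfers and two launches versus one of each). Real gains depend on kernel sizes, device occupancy, and how much data is actually shared, none of which this model captures.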
“…A composite format that is a combination of ELL and COO (coordinate format), named hybrid (HYB), is proposed by Bell and Garland [8], and a combination of ELL and CSR (Compressed Sparse Row) formats is proposed by Anirudh et al. [9] and Kiran and Kishore [10]. Further, CPU to GPU communication time importance and optimizations were given by various researchers ([13]-[17]). …”
Section: Related Work (mentioning)
confidence: 99%
“…Otherwise, it will not design robust programs. Literature [5] studies the concurrency problem from the perspective of network programs, pointing out that the concurrency problem is an intrinsically complex problem, which makes traditional programming methods encounter many difficulties in the development of high-quality (concurrent) programs, and seriously affects the development efficiency. The impact of concurrency problems on the complexity of (concurrent) program development can be likened to the "software concurrency crisis" caused by multi-core processors.…”
Section: Introduction (mentioning)
confidence: 99%