2013 IEEE International Conference on Cluster Computing (CLUSTER)
DOI: 10.1109/cluster.2013.6702638

GGAS: Global GPU address spaces for efficient communication in heterogeneous clusters

Abstract: Modern GPUs are powerful high-core-count processors, which are no longer used solely for graphics applications, but are also employed to accelerate computationally intensive general-purpose tasks. For utmost performance, GPUs are distributed throughout the cluster to process parallel programs. In fact, many recent high-performance systems in the TOP500 list are heterogeneous architectures. Despite being highly effective processing units, GPUs on different hosts are incapable of communicating without assistance…
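
The abstract's central idea, GPU-initiated communication through a global address space, can be pictured with a short CUDA sketch. This is only an illustration of the programming model under the assumption that a second local device buffer stands in for the window GGAS would map in from a remote GPU; it does not use the GGAS hardware or API, and all names (put_and_notify, remote_window, remote_flag) are hypothetical.

```cuda
// Minimal sketch, NOT the GGAS API: a second device buffer ("remote_window")
// stands in for memory that GGAS would map in from another node's GPU, so only
// the programming model (GPU-initiated put + notification flag) is shown.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Kernel-side "put": one block copies the payload into the window and then
// raises a flag so the (conceptual) remote consumer knows the data is complete.
__global__ void put_and_notify(const float *src, float *remote_window,
                               volatile int *remote_flag, int n) {
    for (int i = threadIdx.x; i < n; i += blockDim.x)
        remote_window[i] = src[i];
    __syncthreads();               // all puts of this block have been issued
    __threadfence_system();        // order the data before the notification
    if (threadIdx.x == 0) *remote_flag = 1;
}

int main() {
    const int n = 1024;
    std::vector<float> host(n, 2.0f);

    float *src, *window;
    int *flag;
    cudaMalloc((void **)&src, n * sizeof(float));
    cudaMalloc((void **)&window, n * sizeof(float));   // stand-in for remote memory
    cudaMalloc((void **)&flag, sizeof(int));
    cudaMemcpy(src, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(flag, 0, sizeof(int));

    put_and_notify<<<1, 256>>>(src, window, flag, n);
    cudaDeviceSynchronize();

    int done = 0;
    cudaMemcpy(&done, flag, sizeof(int), cudaMemcpyDeviceToHost);
    printf("notification flag = %d (1 means the consumer may read the window)\n", done);

    cudaFree(src);
    cudaFree(window);
    cudaFree(flag);
    return 0;
}
```

On real GGAS hardware the window would be backed by the interconnect rather than by local device memory, the intent being to let GPUs on different hosts communicate without CPU assistance, as the abstract describes.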

Cited by 31 publications (10 citation statements)
References 12 publications
“…For larger messages, GGAS and MPI do not differ strongly, though GGAS is still slightly better. However, this corresponds to our previous results [1], which show that GGAS performs best for small and medium-sized data transfers.…”
Section: Comparison With MPI (supporting)
confidence: 91%
“…The second one is required after the broadcast operation. However, although GGAS allows fast synchronization between the GPUs [1], every synchronization requires additional data transfers and adds overhead to the application. Especially for small sizes, this synchronization overhead may surpass the data transfer latency.…”
Section: B. Reduction With Remote Writes (mentioning)
confidence: 99%
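
The overhead this statement describes can be made concrete with a small, hypothetical CUDA sketch: before a GPU may reduce a payload written into its window, it has to wait for a notification flag, and for small messages that extra flag traffic can rival the payload transfer itself. The host-mapped flag, the wait_and_reduce kernel, and all buffer names below are assumptions for illustration, not part of GGAS.

```cuda
// Hypothetical sketch of the synchronization pattern discussed above, NOT GGAS
// itself: before the consumer GPU may reduce a payload, it spins on a
// notification flag. The flag write is an extra transfer, and for very small
// payloads this synchronization can cost as much as the data movement.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// One block waits for the producer's flag, then sums the received payload.
__global__ void wait_and_reduce(volatile int *ready_flag, const float *window,
                                float *result, int n) {
    __shared__ float partial[256];

    if (threadIdx.x == 0) {
        while (*ready_flag == 0) { }          // the wait the citing paper refers to
    }
    __syncthreads();

    float acc = 0.0f;
    for (int i = threadIdx.x; i < n; i += blockDim.x) acc += window[i];
    partial[threadIdx.x] = acc;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s /= 2) {   // block-local tree reduction
        if (threadIdx.x < s) partial[threadIdx.x] += partial[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0) *result = partial[0];
}

int main() {
    const int n = 64;                          // small message: sync cost dominates
    std::vector<float> host(n, 1.0f);

    float *window, *result;
    cudaMalloc((void **)&window, n * sizeof(float));
    cudaMalloc((void **)&result, sizeof(float));
    cudaMemcpy(window, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);

    // Host-mapped flag, so the "remote" notification can arrive while the
    // consumer kernel is already spinning on it.
    int *flag_host, *flag_dev;
    cudaHostAlloc((void **)&flag_host, sizeof(int), cudaHostAllocMapped);
    *flag_host = 0;
    cudaHostGetDevicePointer((void **)&flag_dev, flag_host, 0);

    wait_and_reduce<<<1, 256>>>(flag_dev, window, result, n);
    *flag_host = 1;                            // stands in for the peer's remote flag write
    cudaDeviceSynchronize();

    float sum = 0.0f;
    cudaMemcpy(&sum, result, sizeof(float), cudaMemcpyDeviceToHost);
    printf("reduced sum = %.1f (expected %d)\n", sum, n);

    cudaFree(window);
    cudaFree(result);
    cudaFreeHost(flag_host);
    return 0;
}
```
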
“…Similarly, MIC-RO [13] enables sharing and using multiple Intel Many Integrated Core (MIC) cards across nodes. The work in [14] proposed concepts that allow an HCA to access GPU memory, similar to GDR. However, their concepts require specific hardware and cannot be applied to production-ready HPC systems.…”
Section: Related Work (mentioning)
confidence: 99%