2011 IEEE 19th Annual Symposium on High Performance Interconnects 2011
DOI: 10.1109/hoti.2011.19

Evaluating the Potential of Cray Gemini Interconnect for PGAS Communication Runtime Systems

Abstract: The Cray Gemini Interconnect has recently been introduced as the next-generation network for building scalable multi-petascale supercomputers. The Cray XE6 systems, which use the Gemini Interconnect, are becoming available with Message Passing Interface (MPI) and Partitioned Global Address Space (PGAS) models such as Global Arrays, Unified Parallel C, Co-Array Fortran and Cascade High Performance Language. These PGAS models use one-sided communication runtime systems such as MPI-Remote Memory Access, Aggrega…

Cited by 19 publications (8 citation statements)
References 21 publications (39 reference statements)
“…Therefore, BigInt does not increase communication time significantly because the additional bits lead to a minor increase since packet headers and fixed (zero-load) delays in the network and NICs dominate for small transfers. Our results are confirmed by past work [21] which showed that the communication delay for messages with payloads containing double-word, quad-word, and double quad-word variables is identical and approximately 12 µs for up to 512 bytes, which is a larger payload size than BigInts (256 bytes).…”
Section: Performance Evaluation (supporting)
confidence: 79%
“…This was previously possible only with double-precision variables due to the complex internal structures of arbitrary-precision libraries. Even though the size of a BigInt variable is 2101 bits (33× larger than a 64-bit double-precision variable), we observe insignificant loss in communication delay due to the fixed latency costs and packet overhead bytes in modern large-scale networks [21]. Therefore, BigInts readily apply to network operations and can be combined with past work on local-node computations that uses sorting and recursion or alternative wide fixed-point representations with dedicated hardware support [22], [23], [15], in order to provide reproducible system-wide operations with no precision loss.…”
Section: Introduction (mentioning)
confidence: 99%
“…Compiler-based approaches, such as OMPI [16], similarly Friedley and Lumsdaine describe a compiler approach, producing a 40% improvement, by exploiting one-sided communication via transformation of MPI calls [4]. Similar work exploits one-sided communication within the Partitioned Global Address Space (PGAS) languages like Chapel [2], UPC [20], Global Arrays [14], and Co-Array FORTRAN [15] with some preliminary Gemini work using the DMAPP API [21].…”
Section: Performance Results (mentioning)
confidence: 99%
“…Vishnu et al. [23], [24] present the implementation of the Aggregate Remote Memory Copy Interface (ARMCI) on Cray XE6 using DMAPP with relaxed ordering. Shan et al. [25] present a performance evaluation of UPC and MPI benchmarks on Gemini and show applications using single-sided communication outperform those using two-sided paradigm.…”
Section: B Architectures and Relaxed Ordering (mentioning)
confidence: 99%