Proceedings of the 18th Annual International Conference on Supercomputing 2004
DOI: 10.1145/1006209.1006235

Performance characteristics of the Cray X1 and their implications for application performance tuning

Abstract: During the last decade the scientific computing community has optimized many applications for execution on superscalar computing platforms. The recent arrival of the Japanese Earth Simulator has revived interest in vector architectures, especially in the US. It is important to examine how to port our current scientific applications to the new vector platforms and how to achieve high performance. The success of porting these applications will also influence the acceptance of new vector architectures. In this pap…

Cited by 10 publications (6 citation statements)
References 12 publications (9 reference statements)
“…As a result, the systems can achieve a high sustained performance. However, although the performance of the whole system has been evaluated, the performance gain due to ECache is not quantitatively discussed [10] [11].…”
Section: Related Work 21 Cache Memory For Vector Processors (citation type: mentioning)
confidence: 99%
“…A cache memory for vector processors has been so far employed by modern vector processors such as Cray X1 and NEC SX-9 [3] [12]. Although the performance of vector supercomputers with a cache has already been reported [4][10] [11], the performance gain due to the cache is not quantitatively examined yet. To realize efficient large-scale simulations on the modern vector supercomputers, one important issue is about the optimization techniques to exploit the limited size of cache memory.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…APEX-Map [2][3][4]22,23] is a tunable synthetic benchmark that measures global data access performance. It is designed based on parameterized concepts for temporal and spatial locality and generates a global data access stream according to specified levels of these measures of locality.…”
Section: Data Locality Concepts and Application Performance Character (citation type: mentioning)
confidence: 99%
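The statement above describes a data access stream generated from two locality parameters. As a rough illustration only, the Python sketch below draws block start addresses from a power-law distribution (temporal locality) and then touches a run of contiguous elements (spatial locality). The function name `access_stream` and the parameters `mem_size`, `alpha`, `block_len`, and `num_blocks` are hypothetical names chosen for this sketch; the sampling formula is an assumption about how such a stream might be built and is not taken from the APEX-Map source code.

```python
import random


def access_stream(mem_size, alpha, block_len, num_blocks, seed=0):
    """Return a list of memory indices forming a synthetic access stream.

    alpha     -- assumed temporal-locality knob in (0, 1]; 1.0 gives uniformly
                 random block starts, smaller values concentrate starts near 0
                 so the same addresses are reused more often.
    block_len -- assumed spatial-locality knob: number of contiguous elements
                 touched after each random start.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not 1 <= block_len <= mem_size:
        raise ValueError("block_len must be between 1 and mem_size")

    rng = random.Random(seed)          # seeded for reproducible streams
    max_start = mem_size - block_len   # keep every block inside the array
    stream = []
    for _ in range(num_blocks):
        r = rng.random()               # uniform in [0, 1)
        start = int(max_start * r ** (1.0 / alpha))
        stream.extend(range(start, start + block_len))
    return stream


if __name__ == "__main__":
    # Low temporal locality, short blocks.
    print(access_stream(mem_size=1024, alpha=1.0, block_len=4, num_blocks=3))
    # High temporal locality, longer blocks.
    print(access_stream(mem_size=1024, alpha=0.01, block_len=16, num_blocks=3)[:8])
```

Timing a loop that reads an array at these indices, while sweeping `alpha` and `block_len`, is one way such a probe could expose how a memory system responds to different degrees of locality.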
“…The X1 provides a large set of registers to reduce the number of memory accesses, reduce register spills, eliminate write-after-read dependencies and hide memory latency. The register set includes thirty-two 64-bit vector registers, 8 Fig. 1.…”
Section: Processing Node Architecture (citation type: mentioning)
confidence: 99%
“…Cray X1 optimization strategies have been reported for a set of synthetic benchmarks [8]. In this paper, we conduct performance and scaling analysis of a scientific application called Parallel Ocean Program (POP), which was developed at the Los Alamos National Laboratory [13].…”
Section: Application (citation type: mentioning)
confidence: 99%