Characterizing the communication behavior of large-scale applications is a difficult and costly task due to code/system complexity and long execution times. While many tools have been developed to study this behavior, they either aggregate information in a lossy way through high-level statistics or produce huge trace files that are hard to handle.

We contribute an approach that produces communication traces that are orders of magnitude smaller, if not near constant in size, regardless of the number of nodes, while preserving structural information. We introduce intra- and inter-node compression techniques for MPI events that extract an application's communication structure. We further present a replay mechanism for the traces generated by our approach and discuss results of our implementation for BlueGene/L. Given this novel capability, we discuss its impact on communication tuning and beyond. To the best of our knowledge, such a concise, scalable representation of MPI traces combined with deterministic MPI call replay is without precedent.

Key words: High-Performance Computing, Scalability, Communication Tracing

PACS: 07.05.Bx

An earlier version of this paper appeared at IPDPS'07 [20]. This journal version extends the earlier paper with novel domain-specific intra- and inter-node compression techniques, a completely redesigned inter-node merge algorithm, new results for a larger class of codes yielding near-constant trace sizes, a study to identify the time-step loop, and extended related work.
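As a minimal sketch of the intra-node compression idea only, assuming a simplified event record: consecutive, identical MPI events recorded at one node can be collapsed into a single entry with a repeat count, the degenerate case of the structure-preserving compression developed later in the paper. The types and names below (Event, CompressedEvent, compress) are illustrative assumptions, not the paper's actual data structures.

    /* Sketch: collapse runs of identical MPI events captured at one node.
     * Fields and names are illustrative, not the paper's implementation. */
    #include <stdio.h>
    #include <string.h>

    typedef struct {
        const char *op;   /* e.g. "MPI_Send" */
        int peer;         /* source/destination rank */
        int count;        /* message size in elements */
    } Event;

    typedef struct {
        Event ev;
        int repeats;      /* number of consecutive occurrences merged */
    } CompressedEvent;

    static int same_event(const Event *a, const Event *b)
    {
        return strcmp(a->op, b->op) == 0 && a->peer == b->peer && a->count == b->count;
    }

    /* Collapse consecutive identical events; returns number of compressed records. */
    static int compress(const Event *in, int n, CompressedEvent *out)
    {
        int m = 0;
        for (int i = 0; i < n; i++) {
            if (m > 0 && same_event(&out[m - 1].ev, &in[i])) {
                out[m - 1].repeats++;
            } else {
                out[m].ev = in[i];
                out[m].repeats = 1;
                m++;
            }
        }
        return m;
    }

    int main(void)
    {
        Event trace[6] = {
            {"MPI_Send", 1, 1024}, {"MPI_Send", 1, 1024}, {"MPI_Send", 1, 1024},
            {"MPI_Recv", 1, 1024}, {"MPI_Recv", 1, 1024}, {"MPI_Barrier", -1, 0}
        };
        CompressedEvent out[6];
        int m = compress(trace, 6, out);
        for (int i = 0; i < m; i++)
            printf("%d x %s(peer=%d, count=%d)\n",
                   out[i].repeats, out[i].ev.op, out[i].ev.peer, out[i].ev.count);
        return 0;
    }

In this toy run, six recorded events reduce to three records; the techniques in the paper go further by capturing loop structure across non-identical events and by merging traces across nodes.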