2005
DOI: 10.1147/rd.492.0425
Resource allocation and utilization in the Blue Gene/L supercomputer

Abstract: This paper describes partition allocation for parallel jobs in the Blue Gene/L supercomputer. It describes the novel network architecture of the Blue Gene/L (BG/L) three-dimensional (3D) computational core and presents a preliminary analysis of its properties and advantages compared with those of more traditional systems. The scalability challenge is solved in BG/L by sacrificing granularity of system management. The system is treated as a collection of composite allocation units that contain both processing …

Cited by 32 publications (17 citation statements)
References 15 publications (20 reference statements)

“…The obvious way to go is to introduce contiguous allocation strategies in schedulers for parallel machines. In some other papers addressing this issue [3,9,10,12,23] allocation algorithms were proposed mainly for k-ary n-cube topologies. Figures of merit usually did not show how placement strategies affect the runtime of an application instance, but just the completion time of a list of jobs.…”
Section: Related Work (mentioning)
confidence: 99%
“…distance between nodes is considered as the difference between node identifiers. The most notable example of current supercomputer that tries to maintain locality when allocating resources is the BlueGene family (3D tori), whose scheduler [3] puts tasks from the same application in one or more midplanes of 8x4x4 nodes.…”
Section: Related Work (mentioning)
confidence: 99%
“…The difference is that this work implicitly assumes that the entire machine is devoted to a single job. (This occurs for capability jobs, but is also the norm on BlueGene systems, which guarantees each job its own submesh, which is kept isolated from other jobs [5].) For example, Yu et al [31] devise strategies based on folding one mesh into another with a minimum of dilation (stretching a communication graph edge across multiple communication links).…”
Section: Motivation and Related Work (mentioning)
confidence: 99%
“…Some of them provide methods for the system administrator to develop their own strategies but, in practice, this is rarely done. To our knowledge, the only two current schedulers that maintain the locality are the one used by the BlueGene family supercomputers [9] and SLURM. The BlueGene scheduler puts tasks from the same application in one or more midplanes of 8x8x8 nodes which decreases network contention and allows locality exploitation.…”
Section: Related Work (mentioning)
confidence: 99%