ACM/IEEE SC 2005 Conference (SC'05)
DOI: 10.1109/sc.2005.4

A Scalable Distributed Parallel Breadth-First Search Algorithm on BlueGene/L

Abstract: Many emerging large-scale data science applications require searching large graphs distributed across multiple memories and processors. This paper presents a distributed breadth-first search (BFS) scheme that scales for random graphs with up to three billion vertices and 30 billion edges. Scalability was tested on IBM BlueGene/L with 32,768 nodes at the Lawrence Livermore National Laboratory. Scalability was obtained through a series of optimizations, in particular, those that ensure scalable use of memory. We…
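As a rough illustration of what a BFS over a graph "distributed across multiple memories and processors" involves, the following is a minimal single-process sketch of a generic level-synchronous BFS with 1D vertex partitioning. It is not the paper's scheme: the abstract is truncated and its optimizations are not reproduced here. The names `owner` and `distributed_bfs`, the modulo ownership rule, and the in-memory `outbox` that stands in for an all-to-all message exchange are all assumptions made for illustration.

```python
from collections import defaultdict


def owner(v, num_ranks):
    """Hypothetical 1D partitioning rule: vertex v is owned by rank v % num_ranks."""
    return v % num_ranks


def distributed_bfs(edges, num_ranks, source):
    """Simulate a level-synchronous BFS over num_ranks ranks in one process.

    Returns a dict mapping each reachable vertex to its BFS level.
    """
    # Each simulated rank stores adjacency lists only for the vertices it owns.
    local_adj = [defaultdict(list) for _ in range(num_ranks)]
    for u, v in edges:
        local_adj[owner(u, num_ranks)][u].append(v)
        local_adj[owner(v, num_ranks)][v].append(u)

    # Each rank records levels only for its own vertices.
    local_level = [dict() for _ in range(num_ranks)]
    frontier = [[] for _ in range(num_ranks)]
    src_rank = owner(source, num_ranks)
    local_level[src_rank][source] = 0
    frontier[src_rank].append(source)

    level = 0
    while any(frontier):
        # Expand: every rank scans its local frontier and buckets the
        # discovered neighbors by the rank that owns them.
        outbox = [[[] for _ in range(num_ranks)] for _ in range(num_ranks)]
        for r in range(num_ranks):
            for u in frontier[r]:
                for v in local_adj[r][u]:
                    outbox[r][owner(v, num_ranks)].append(v)

        # Exchange and fold: each owner receives candidate vertices (a stand-in
        # for an all-to-all exchange) and keeps only the unvisited ones.
        level += 1
        next_frontier = [[] for _ in range(num_ranks)]
        for dst in range(num_ranks):
            for src in range(num_ranks):
                for v in outbox[src][dst]:
                    if v not in local_level[dst]:
                        local_level[dst][v] = level
                        next_frontier[dst].append(v)
        frontier = next_frontier

    # Gather the per-rank results into one dictionary for inspection.
    result = {}
    for levels in local_level:
        result.update(levels)
    return result
```

In this sketch the computed levels do not depend on the number of simulated ranks; only the amount of data that crosses rank boundaries changes, which is the cost a real distributed implementation would need to control.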

Cited by 188 publications (159 citation statements)
References 19 publications
“…Local discovery (per substep) (lines 11-20) Search for parents with the information available locally.…”
Section: Parallel and Distributed BFS Algorithm (mentioning)
confidence: 99%
“…Yoo [16] improves on this by employing block-cyclic distribution, eliminating the need for transpose vector at the cost of added code complexity. We adapt Yoo's method so that it becomes applicable to hybrid BFS (Fig.…”
Section: Reducing Communication With Better Partitioning (mentioning)
confidence: 99%
“…There have been numerous implementations of parallel graph algorithms using various computer architectures, including distributed memory supercomputers [36], shared memory supercomputers [4], and multi-core SMP machines [21]. In the context of points-to analyses, the only parallel implementation we know of [25] has been discussed in depth in previous sections.…”
Section: Related Work (mentioning)
confidence: 99%
“…LLNL first demonstrated breadth-first search of a 3 × 10⁹-node graph on the IBM BlueGene/L, the world's fastest supercomputer.⁷ A random graph of this size is the largest that can fit in the machine's 32,768-node memory. Subsequently, LLNL processed a 10¹⁰-node scale-free graph using a very different approach and architecture.…”
Section: I/O-Intensive Sparse Graph Analysis (mentioning)
confidence: 99%