2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
DOI: 10.1109/ipdps47924.2020.00022
Optimizing High Performance Markov Clustering for Pre-Exascale Architectures

Cited by 14 publications (10 citation statements)
References 28 publications (41 reference statements)
“…Over the past decade, CombBLAS made significant progress in (a) developing new algorithms for sparse-matrix primitives [7], (b) implementing algorithms to extract high performance from heterogeneous distributed systems with CPUs and GPUs [33], (c) demonstrating extreme scalability using communication-avoiding algorithms that scale to the limit of supercomputers [36], [37], and (d) providing customized functionality for several high-impact applications in computational biology [38], [9] and scientific computing [17]. While many of these advances have already been published separately, we show the overall impact of moving from CombBLAS 1.0 to CombBLAS 2.0 and demonstrate how CombBLAS 2.0 made important progress toward exascale.…”
Section: Results
confidence: 99%
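The "sparse-matrix primitives" mentioned above are kernels such as SpGEMM generalized over user-defined semirings. As a rough illustration only (this is not CombBLAS's API; the function name, arguments, and accumulator strategy below are assumptions), a single-node semiring SpGEMM can be sketched in Python:

```python
# Minimal sketch of a semiring SpGEMM, the kind of sparse-matrix
# primitive CombBLAS generalizes. NOTE: not CombBLAS's actual API;
# the semiring plumbing here is an illustrative assumption.
# Shown with the (min, +) semiring; MCL's expansion step is the
# same primitive over the ordinary (+, *) semiring.
import numpy as np
from scipy.sparse import csr_matrix

def semiring_spgemm(A, B, add=min, mul=lambda x, y: x + y, identity=np.inf):
    """Row-by-row C = A (x) B over a user-supplied semiring."""
    A, B = csr_matrix(A), csr_matrix(B)
    rows, cols, vals = [], [], []
    for i in range(A.shape[0]):
        acc = {}  # sparse accumulator for row i of C
        for jj in range(A.indptr[i], A.indptr[i + 1]):
            k, a = A.indices[jj], A.data[jj]
            for kk in range(B.indptr[k], B.indptr[k + 1]):
                j, b = B.indices[kk], B.data[kk]
                acc[j] = add(acc.get(j, identity), mul(a, b))
        for j, v in acc.items():
            rows.append(i); cols.append(j); vals.append(v)
    return csr_matrix((vals, (rows, cols)), shape=(A.shape[0], B.shape[1]))
```

With add set to ordinary addition, mul to multiplication, and identity to 0, this reduces to the standard sparse product that each MCL expansion step performs.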
“…Other optimizations include faster memory-requirement estimation using approximate algorithms, and a binary merge scheme that spreads the merging of partial results across the stages of the SUMMA algorithm. A recent work [33] describes these optimizations in more detail.…”
Section: GPU Acceleration
confidence: 99%
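To make the binary merge idea concrete, here is a hedged single-node sketch: partial products from rank-k slabs (standing in for SUMMA stages) are kept on a stack and merged pairwise as the stages proceed, rather than all at once at the end. The stage partitioning, the size heuristic, and the function name are assumptions for illustration, not HipMCL's actual scheme:

```python
# Illustrative sketch, NOT HipMCL's implementation: compute C = A @ B
# in `stages` rank-k slabs and merge partial results pairwise as the
# stages proceed, so merge work is interleaved with the stages instead
# of piling up after the last one.
import numpy as np
from scipy import sparse

def summa_like_spgemm(A, B, stages=4):
    n = A.shape[1]
    bounds = np.linspace(0, n, stages + 1, dtype=int)
    stack = []  # partial results awaiting merging
    for s in range(stages):
        lo, hi = bounds[s], bounds[s + 1]
        stack.append(A[:, lo:hi] @ B[lo:hi, :])  # this stage's partial product
        # Merge the top two partials while they are comparably sized,
        # mimicking a binary merge tree spread across the stages.
        while len(stack) >= 2 and stack[-1].nnz >= stack[-2].nnz // 2:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
    result = stack.pop()
    while stack:
        result = result + stack.pop()
    return result
```

Merging comparably sized partials keeps each individual merge cheap and avoids one large merge after the final stage, which is the cost the statement's scheme spreads out.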
“…A second thrust of ExaBiome involves protein clustering and annotation. ExaBiome's HipMCL [49] and PASTIS [50] codes (the latter developed jointly with ExaGraph) provide a scalable protein clustering pipeline, whereas a new prototype deep learning framework [51] shows promising results for functional annotation. HipMCL runs on thousands of nodes and effectively uses GPUs.…”
Section: ExaBiome
confidence: 99%
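HipMCL is a distributed-memory implementation of the Markov Clustering (MCL) algorithm that this paper optimizes. A minimal single-node sketch of the iteration HipMCL scales out (expansion via SpGEMM, inflation, pruning); the thresholds and the convergence test below are illustrative assumptions:

```python
# Single-node sketch of the MCL iteration that HipMCL distributes.
# Pruning and convergence parameters are illustrative assumptions.
import numpy as np
from scipy import sparse

def col_normalize(M):
    s = np.asarray(M.sum(axis=0)).ravel()
    s[s == 0] = 1.0
    return (M @ sparse.diags(1.0 / s)).tocsr()

def mcl(adjacency, inflation=2.0, prune=1e-4, max_iters=100, tol=1e-8):
    """Iterate MCL on an adjacency matrix until the iterate stabilizes."""
    A = sparse.csr_matrix(adjacency, dtype=float)
    A = A + sparse.eye(A.shape[0], format="csr")   # self-loops
    M = col_normalize(A)
    for _ in range(max_iters):
        M_new = M @ M                              # expansion: the dominant SpGEMM
        M_new = M_new.power(inflation)             # inflation: elementwise power
        M_new.data[M_new.data < prune] = 0.0       # pruning keeps iterates sparse
        M_new.eliminate_zeros()
        M_new = col_normalize(M_new)
        if abs(M_new - M).max() < tol:             # crude convergence test
            return M_new
        M = M_new
    return M
```

In the converged matrix, each row that retains nonzeros corresponds to an attractor, and its nonzero columns form one cluster. The expansion SpGEMM dominates the runtime, which is why the distributed SpGEMM optimizations discussed above matter.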
“…For GPU-equipped clusters, we developed a model to choose the fastest GPU-based SpGEMM depending on the sparsity of the current MCL iteration, and utilized a pipelined communication scheme that hides the cost of CPU-to-GPU data transfers. These advances, coupled with a distributed-memory implementation of a randomized output-structure prediction algorithm, resulted in orders-of-magnitude speedups compared to the original HipMCL (Selvitopi et al, 2020b).…”
Section: Algebraic Approaches for Graph Algorithms and Combinatorial Problems
confidence: 99%
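The statement does not reproduce the selection model itself; as a toy stand-in, a dispatcher can compute cheap sparsity statistics of the upcoming multiply (the exact flop count of an SpGEMM is computable in advance from the CSR structure) and branch on them. The thresholds and kernel names below are invented for the sketch; the paper fits a real model to measured GPU kernel performance:

```python
# Toy stand-in for sparsity-based SpGEMM kernel selection. The
# thresholds and kernel names are assumptions, not the paper's model.
import numpy as np
from scipy import sparse

def spgemm_flops(A, B):
    """Exact multiply count of C = A @ B, computable before running it."""
    A, B = sparse.csr_matrix(A), sparse.csr_matrix(B)
    row_nnz_B = np.diff(B.indptr)            # nonzeros per row of B
    return int(row_nnz_B[A.indices].sum())   # one multiply per (A nonzero, matching B-row entry)

def choose_spgemm_kernel(A, B, flop_threshold=1e8, density_threshold=1e-3):
    density = A.nnz / (A.shape[0] * A.shape[1])
    if density > density_threshold:
        return "hash-based kernel"    # denser mid-run MCL iterations
    if spgemm_flops(A, B) > flop_threshold:
        return "merge-based kernel"   # heavy work on very sparse operands
    return "row-by-row kernel"        # cheap default for early/late iterations
```

The flop count is exact: each nonzero A[i, k] contributes one multiplication per nonzero in row k of B, so it can be computed in O(nnz(A)) time without forming the output.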