2014
DOI: 10.1007/978-3-642-54420-0_71

Using Performance Tools to Support Experiments in HPC Resilience

Abstract: The high performance computing (HPC) community is working to address fault tolerance and resilience concerns for current and future large scale computing platforms. This is driving enhancements in the programming environments, specifically research on enhancing message passing libraries to support fault tolerant computing capabilities. The community has also recognized that tools for resilience experimentation are greatly lacking. However, we argue that there are several parallels between "performanc…

Cited by: 0 publications
References: 11 publications