2012 13th International Conference on Parallel and Distributed Computing, Applications and Technologies
DOI: 10.1109/pdcat.2012.31

A Synchronization-Induced Checkpoint Protocol for Group-Synchronous Parallel Programs

Abstract: Group checkpointing is a compromise between global checkpointing and log-based recovery. It offers both reduced runtime overhead and localized recovery, improving the fault-tolerance performance of large-scale distributed systems. However, parallel programs cannot benefit efficiently from this strategy, as they often involve synchronous or semisynchronous interactions that incur extra idling delays between processes as well as between process groups. This paper presents an analytical study on such delays …

Cited by 1 publication
References 15 publications