19th IEEE International Parallel and Distributed Processing Symposium
DOI: 10.1109/ipdps.2005.316

Optimizing Checkpoint Sizes in the C3 System

Cited by 14 publications (16 citation statements)
References 14 publications
“…To checkpoint, prevailing software approaches [10,11,15,30,39,46] first impose a barrier on all threads, and then record their states (Figure 3(a)). A second barrier is used to ensure that a thread continues the execution only after all others have checkpointed, to prevent the states they are recording from being modified.…”
Section: Recovery From Global Exceptions (mentioning)
confidence: 99%
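The two-barrier scheme described in this statement can be sketched as follows. This is a minimal illustration using Python's `threading.Barrier`, not the mechanism of any cited system; the function name `run_with_checkpoints`, the shared `state` list, and the neighbour-update compute phase are assumptions made up for the example.

```python
import threading


def run_with_checkpoints(num_threads=4, steps=3):
    """Run toy workers that checkpoint between two barriers (hypothetical example)."""
    state = [0] * num_threads
    # snapshots[step][tid] holds what thread `tid` recorded at checkpoint `step`.
    snapshots = [[None] * num_threads for _ in range(steps)]
    barrier = threading.Barrier(num_threads)

    def worker(tid):
        for step in range(steps):
            # Compute phase: each thread updates its neighbour's slot, so a
            # slot can change while its owner is trying to record it.
            state[(tid + 1) % num_threads] += 1

            # First barrier: every thread has finished computing for this step.
            barrier.wait()

            # Record phase: each thread saves its own slot.
            snapshots[step][tid] = state[tid]

            # Second barrier: nobody resumes computing until all threads have
            # recorded, so the state being saved cannot be modified.
            barrier.wait()

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return snapshots


if __name__ == "__main__":
    print(run_with_checkpoints())
```

Dropping either `barrier.wait()` would let a neighbour's update race with the record phase, which is the inconsistency the second barrier is said to prevent.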
“…Upon exception, they recover to a prior error-free state and resume the program, losing all work completed since. A plethora of hardware [3,34,37,43] and software [10,11,15,27,30,39,46] approaches, striking trade-offs between complexity and overheads, have been proposed in the literature (Table 1: rows 1, 2). Our qualitative analysis shows that their checkpointing and recovery processes will be too inefficient to handle frequent exceptions.…”
Section: Introduction (mentioning)
confidence: 99%
“…Also, the checkpoints are taken at a time when the application memory footprint is small. Another approach proposed by Marques et al [28] dynamically partitions objects of the program into subheaps in memory. By specifying how the checkpoint mechanism treat objects in different subheaps as always save, never save and once save, they reduce the checkpoint size at runtime.…”
Section: Related Work (mentioning)
confidence: 99%
“…In our approach, a static analysis is done at compile time to compute information that can be fed to the runtime system to reduce the checkpointing overhead. In [21] we describe how we have added functions to our heap implementation that allows heap objects to be partitioned into "colors". There are additional functions for assigning checkpointing policies to each color (e.g., "Never save this color" or "Save this color only once").…”
Section: Automatic Application-level Checkpointing (mentioning)
confidence: 99%
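The per-color (or per-subheap) save policies mentioned in the last two statements can be illustrated with a toy heap. This is a sketch under stated assumptions, not the C3 heap API: the class `ColoredHeap`, the `Policy` enum, and the methods `set_policy`, `alloc`, and `checkpoint` are hypothetical names invented for the example, and unlabeled colors are assumed to default to "always save".

```python
from enum import Enum


class Policy(Enum):
    ALWAYS = "always"  # save at every checkpoint
    NEVER = "never"    # never save (e.g. recomputable scratch data)
    ONCE = "once"      # save at the first checkpoint only (e.g. read-only data)


class ColoredHeap:
    """Toy heap whose objects carry a color; each color has a save policy."""

    def __init__(self):
        self._objects = {}         # name -> (color, value)
        self._policies = {}        # color -> Policy
        self._saved_once = set()   # names of ONCE objects already saved

    def set_policy(self, color, policy):
        self._policies[color] = policy

    def alloc(self, name, color, value):
        self._objects[name] = (color, value)

    def checkpoint(self):
        """Return the objects that this checkpoint would write out."""
        saved = {}
        for name, (color, value) in self._objects.items():
            policy = self._policies.get(color, Policy.ALWAYS)  # assumed default
            if policy is Policy.NEVER:
                continue
            if policy is Policy.ONCE:
                if name in self._saved_once:
                    continue
                self._saved_once.add(name)
            saved[name] = value
        return saved


if __name__ == "__main__":
    heap = ColoredHeap()
    heap.set_policy("scratch", Policy.NEVER)
    heap.set_policy("constants", Policy.ONCE)
    heap.alloc("grid", "live", [1, 2, 3])
    heap.alloc("tmp", "scratch", [0] * 8)
    heap.alloc("table", "constants", (3, 5, 7))
    print(heap.checkpoint())  # first checkpoint
    print(heap.checkpoint())  # later checkpoint is smaller
```

In this sketch the second checkpoint omits both the "never" and the already-saved "once" objects, which is how such policies would shrink checkpoint size at run time.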