2005
DOI: 10.1109/clustr.2005.347039

Transparent Checkpoint-Restart of Distributed Applications on Commodity Clusters

Cited by 49 publications
(49 citation statements)
References 12 publications
“…We point out that distributed snapshot algorithms have long been proposed and applied [24,13,11,18,25,23] and thus are not our contribution. The contribution of VNsnap is the application of a classic snapshot algorithm to the emerging virtualized environments, as well as the proof of its applicability.…”
Section: Overview
Citation type: mentioning
confidence: 99%
“…ZapC [18] is a thin virtualization layer that provides checkpoint/restart functionality for a self-contained virtual machine abstraction, namely a pod (PrOcess Domain), that contains a group of processes. Due to the smaller checkpointing granularity (a pod vs. a VM), ZapC is more efficient than VNsnap in checkpointing a group of processes.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…This is especially problematic in the field of high performance computing where applications typically are memory demanding. Research has determined that the memory footprint is a major contributor to the checkpoint image size [7,20]. Further, due to the ever-increasing system size and complexity [4], failures occur more frequently than before, thereby making restart latency a critical concern in networked environments.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…For instance, to support execution rollback, applications are placed inside the Zap [19,15] virtual execution environment, while RP code is injected using Dyninst [4]. Zap is a considerably complex component that is tightly coupled with the Linux kernel, and requires maintenance along with the operating system (OS).…”
Section: Introduction
Citation type: mentioning
confidence: 99%