Proceedings of the 47th International Conference on Parallel Processing 2018
DOI: 10.1145/3225058.3225145
A Generic Approach to Scheduling and Checkpointing Workflows

Abstract: This work deals with scheduling and checkpointing strategies to execute scientific workflows on failure-prone large-scale platforms. To the best of our knowledge, this work is the first to target fail-stop errors for arbitrary workflows. Most previous work addresses soft errors, which corrupt the task being executed by a processor but do not cause the entire memory of that processor to be lost, contrary to fail-stop errors. We revisit classical mapping heuristics such as HEFT and MinMin and compleme…

Cited by 11 publications (9 citation statements)
References 40 publications (49 reference statements)
“…Finally, we compare our new general approach with PropCkpt, the approach specific to M-SPGs that we proposed by Han et al (2018a). Figures 20 to 22 present this comparison for Montage, Ligo and Genome, which are the three M-SPGs presented in the study by Han et al (2018a). Overall, the new approaches perform better than PropCkpt.…”
Section: Methods
mentioning
confidence: 95%
“…Authors' Note A preliminary version of this work appeared in the proceedings of ACM ICPP 2018. A shorter version of this work has been published in the proceedings of ICPP'18 (Han et al, 2018b).…”
mentioning
confidence: 99%
“…Recall that #P is the class of counting problems that correspond to NP decision problems [28], and that #P-complete problems are at least as hard as NP-complete problems. Several heuristics to decide which tasks to checkpoint are proposed and evaluated in [14].…”
Section: Checkpointing
mentioning
confidence: 99%
“…In some studies, in addition to fail‐stops, silent errors are considered. In , disk checkpointing is combined with memory checkpointing.…”
Section: Related Studies
mentioning
confidence: 99%