Proceedings of the Joint ACM SIGSOFT Conference -- QoSA and ACM SIGSOFT Symposium -- ISARCS on Quality of Software Architecture 2011
DOI: 10.1145/2000259.2000289

Architecture-based fault tolerance support for grid applications

Abstract: Failure in long-running grid applications is arguably inevitable and costly. Therefore, fault tolerance (FT) support for grid applications is needed. This paper evaluates an extension of our prior work on Recovery Aware Components (RAC), a component-based FT approach. Our extension utilizes the grid application architecture according to a small number of architectural classes. In this paper, we evaluate the MapReduce architecture only and analyze the reliability improvement MapReduce applications would gain by…

Cited by 3 publications (2 citation statements)
References 14 publications
“…Our cost estimation is inspired by Valiant's bulk synchronous-parallel model [45] of parallel computing where global strong synchronisation conservatively approximates systems which may in reality use more fine-grained synchronisation and indeed may allow for more asynchrony than the above approximation would suggest. In performance benchmarks reported in [48], Yusuf et al. demonstrated that such conservative predictions may still be accurate enough if there is enough WCET variation and a large enough number of activities/tasks scheduled on individual processing elements. Thus adjacent modes may be assumed to be strongly separated in the global model while in fact such modes are partially interleaved with respect to each other (subject to restrictions on repetition such as boundedness for message sequence graphs as described by Alur [3] and star-connectivity in trace languages).…”
Section: Proposed Visualisation Approach (mentioning)
confidence: 99%
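
The conservative approximation described in this citation statement can be illustrated with the standard bulk synchronous-parallel (BSP) cost model, in which each superstep costs the slowest local computation plus communication plus one barrier. The sketch below is only an illustration under assumptions: the function names, the two-processor workload, and the parameter values for `g` and `l` are hypothetical and do not come from the cited papers.

```python
from typing import List, Sequence, Tuple

# One superstep: (work done by each processor, size h of the h-relation).
Superstep = Tuple[Sequence[float], int]


def barrier_cost(steps: List[Superstep], g: float, l: float) -> float:
    """BSP estimate: every superstep waits for its slowest processor,
    then pays h * g for communication and l for the global barrier."""
    return sum(max(work) + h * g + l for work, h in steps)


def no_barrier_compute_bound(steps: List[Superstep]) -> float:
    """Computation time if processors never waited at barriers: the total
    work of the most heavily loaded processor (communication ignored)."""
    processors = len(steps[0][0])
    return max(sum(work[p] for work, _ in steps) for p in range(processors))


# Hypothetical workload: two processors whose load alternates between steps.
steps = [([3.0, 1.0], 2), ([1.0, 3.0], 2)]
bsp_estimate = barrier_cost(steps, g=0.5, l=1.0)
async_bound = no_barrier_compute_bound(steps)
```

Because a sum of per-step maxima is never smaller than the maximum of per-processor sums, the barrier-based figure is an upper estimate of the computation time of a less synchronised execution, which is the sense in which the global model is conservative.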
“…When the oracle does not exist or is extremely difficult to apply (namely the oracle problem in the context of testing), the applicability and effectiveness of a fault tolerance strategy will be significantly affected. One way of addressing the oracle problem within the context of fault tolerance is by using assertions to detect the failures due to the violation of certain properties (such as out of range, incorrect data type, etc.) [14].…”
Section: Introduction (mentioning)
confidence: 99%
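
The assertion-based detection mentioned in this citation statement can be sketched as follows. This is a minimal illustration under assumptions, not the mechanism of the cited paper or of RAC: the names `PropertyViolation`, `check_reading`, and `run_with_detection`, the 0 to 100 range, and the fallback value are all hypothetical.

```python
from typing import Callable


class PropertyViolation(Exception):
    """Raised when a result violates an expected property."""


def check_reading(value: object, low: float = 0.0, high: float = 100.0) -> float:
    """Check two properties of a result: numeric type and value range."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise PropertyViolation(f"incorrect data type: {type(value).__name__}")
    if not (low <= value <= high):
        raise PropertyViolation(f"out of range: {value}")
    return float(value)


def run_with_detection(task: Callable[[], object], fallback: float) -> float:
    """Run a task; if its result violates a property, treat that as a
    detected failure and return the fallback value instead."""
    try:
        return check_reading(task())
    except PropertyViolation:
        return fallback


ok = run_with_detection(lambda: 42, fallback=-1.0)
too_big = run_with_detection(lambda: 250, fallback=-1.0)
wrong_type = run_with_detection(lambda: "42", fallback=-1.0)
```

No oracle for the exact correct output is needed here: the checks only state properties that any correct output must satisfy, so a violation signals a failure without the expected value being known.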