2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN)
DOI: 10.1109/dsn.2016.17

Process-Oriented Non-intrusive Recovery for Sporadic Operations on Cloud

Abstract: Cloud-based systems get changed more frequently than traditional systems. These frequent changes involve sporadic operations such as installation and upgrade. Sporadic operations may fail due to the uncertainty of cloud platforms. Each sporadic operation manipulates a number of cloud resources. The accessibility of resources manipulated makes it possible to build an accurate process model of the correct behavior for an operation and its desired effects. This paper proposes a non-intrusive recovery approach for…

Citations: cited by 1 publication (1 citation statement)
References: 18 publications
“…Another important approach to mitigate failures is to implement fault containment strategies. Examples are i) interrupting a service as soon as a failure occurs (i.e., a fail-stop behavior), by turning high-severity failures, such as data losses, into lower-severity API exceptions that can be gracefully handled [5,57,71]; ii) notifying the cloud management system and operators about the failures through error logs, so that they can diagnose issues and undertake recovery actions, such as restoring a previous state checkpoint or backup [19,75]; iii) separating system components across different domains to prevent cascading failures across components [2,26,34].…”
Section: Introduction (mentioning)
confidence: 99%