Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2019)
DOI: 10.1145/3338906.3338916

How bad can a bug get? An empirical analysis of software failures in the OpenStack cloud computing platform

Abstract: Cloud management systems provide abstractions and APIs for programmatically configuring cloud infrastructures. Unfortunately, residual software bugs in these systems can potentially lead to high-severity failures, such as prolonged outages and data losses. In this paper, we investigate the impact of failures in the context of the widespread OpenStack cloud management system, by performing fault injection and by analyzing the impact of the resulting failures in terms of fail-stop behavior, failure detection through log…
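
The methodology summarized in the abstract injects faults into the cloud management system and then checks whether the resulting failures are fail-stop and whether they surface in logs. The sketch below is a minimal, hypothetical illustration of that idea in Python (OpenStack's implementation language); the `create_instance` stand-in and the `inject_fault` decorator are assumptions for illustration, not the fault injection tool used in the paper.

```python
import logging
import random

# Log to a file so that "failure detection through logs" can be checked afterwards.
logging.basicConfig(filename="cloud_mgmt.log", level=logging.INFO)
log = logging.getLogger("fault-injector")


class InjectedFault(Exception):
    """Stands in for a residual software bug triggered at runtime."""


def inject_fault(probability=0.3):
    """Decorator that makes the wrapped API call fail with the given probability."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            if random.random() < probability:
                # Fail-stop behavior: the call raises immediately and the failure
                # is recorded in the log, instead of silently corrupting state.
                log.error("Injected fault in %s", func.__name__)
                raise InjectedFault(func.__name__)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@inject_fault(probability=0.5)
def create_instance(name):
    """Hypothetical stand-in for a 'create server' management API call."""
    log.info("Instance %s created", name)
    return {"name": name, "status": "ACTIVE"}


if __name__ == "__main__":
    for i in range(5):
        try:
            create_instance(f"vm-{i}")
        except InjectedFault as fault:
            print(f"fail-stop failure observed in {fault}")
```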

Cited by 54 publications (45 citation statements) · References 57 publications

“…Therefore, the Chaos Monkey randomly terminates VM instances and containers that run inside a production environment. The principles of the testing tool have inspired developers to implement similar tools for different technologies, for example, Kubernetes clusters [12], Azure Service Fabric [13], Docker [14], or private cloud infrastructures [15]. Following the above ideas, we have developed a tool for monkey testing our self-healing, trans-cloud application management platform.…”
Section: Monkey Testing
confidence: 99%
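
The excerpt above describes Chaos-Monkey-style tools that randomly kill running instances. A minimal, hypothetical sketch of that pattern for local Docker containers follows (it assumes a local Docker daemon and is not the tool described by the citing authors; a real deployment would select VMs or service instances through the platform's management API rather than the Docker CLI).

```python
import random
import subprocess


def running_containers():
    """Return the IDs of currently running Docker containers."""
    result = subprocess.run(
        ["docker", "ps", "-q"], capture_output=True, text=True, check=True
    )
    return result.stdout.split()


def chaos_step(kill_probability=0.2):
    """With some probability, terminate one randomly chosen container."""
    victims = running_containers()
    if victims and random.random() < kill_probability:
        victim = random.choice(victims)
        print(f"chaos: terminating container {victim}")
        subprocess.run(["docker", "kill", victim], check=True)
    else:
        print("chaos: no container terminated this round")


if __name__ == "__main__":
    chaos_step()
```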
“…As a result, depending on the actual deployment of multi-component applications, we may have different durations for their possible "instability periods" (viz., time periods during which some of their application components are left unstable). Instability periods are definitely an issue, as inconsistent answers may cause inconsistent states for an application [13]. Furthermore, unresponsiveness increases the latency in answering end-users, potentially causing client loss in the same way as under-provisioning does [3].…”
Section: Introduction
confidence: 99%
“…Furthermore, advancements in microprocessor manufacturing, which yield lower nodal capacitance and higher transistor density, cause an increase in the soft error rate [21], i.e., the probability of occurrence of a soft error. Software faults are also a threat to the dependability of cloud computing and virtualized systems, particularly software faults in the components required for virtualization (e.g., hypervisor, toolstack, privileged virtual machine) and cloud management [22].…”
Section: Related Work
confidence: 99%
“…If intermediate code compilation exists, byte code manipulation may also be a viable option (Sanches et al. 2011). In many cases, we are able to use abstract forms of the code (e.g., an abstract syntax tree) to inject a particular kind of fault (Cotroneo et al. 2019; Hajdu et al. 2020).…”
Section: Introduction
confidence: 99%
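
To illustrate the AST-based fault injection mentioned in the last excerpt, the following hedged Python sketch uses the standard `ast` module to rewrite a comparison operator, emulating an off-by-one fault; the `in_range` function and the specific mutation are hypothetical examples, not the approach of any particular cited tool.

```python
import ast


class WrongComparisonInjector(ast.NodeTransformer):
    """Rewrite '<' into '<=' to emulate an off-by-one comparison fault."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.LtE() if isinstance(op, ast.Lt) else op for op in node.ops]
        return node


source = """
def in_range(x, limit):
    return x < limit
"""

# Parse the original code, apply the mutation, and compile the faulty variant.
tree = ast.parse(source)
mutated = ast.fix_missing_locations(WrongComparisonInjector().visit(tree))

namespace = {}
exec(compile(mutated, filename="<mutated>", mode="exec"), namespace)

# The injected fault changes the boundary behavior: in_range(10, 10) now returns True.
print(namespace["in_range"](10, 10))
```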