Proceedings of the ACM SIGMETRICS/International Conference on Measurement and Modeling of Computer Systems 2013
DOI: 10.1145/2465529.2465753

Root cause detection in a service-oriented architecture

Abstract: Large-scale websites are predominantly built as a service-oriented architecture. Here, services are specialized for a certain task, run on multiple machines, and communicate with each other to serve a user's request. An anomalous change in a metric of one service can propagate to other services during this communication, resulting in overall degradation of the request. As any such degradation is revenue impacting, maintaining correct functionality is of paramount concern: it is important to find the root cause…

Cited by 42 publications (38 citation statements)
References 39 publications
“…When a certain external factor causes an anomaly, the observed node affected by the external factor shows a highly correlated metric pattern in the anomaly time window. A pseudo-anomaly clustering algorithm [1] could be employed to solve such problems, and we will consider this part in future work.…”
Section: Discussion (mentioning)
confidence: 99%
“…Many research papers on RCA focus on complex large-scale systems [1][2][3][4]; they can be grouped into the following categories:…”
Section: Related Work (mentioning)
confidence: 99%
“…In addition, there are many metrics-based approaches [5], [9]- [13], [24], [25], as well as this work. These use metrics from applications and/or additional infrastructure levels to construct a causality graph that is used to infer root causes.…”
Section: Related Work (mentioning)
confidence: 99%
“…MonitorRank [13], Microscope [12] and CloudRanger [11] identify root causes based on application level metrics only. MonitorRank considers internal and external factors, and proposes a pseudo-anomaly clustering algorithm to classify external factors, then traverses the provided service call graph with a random walk algorithm to identify anomalous services.…”
Section: Related Work (mentioning)
confidence: 99%