2011 IEEE Ninth International Conference on Dependable, Autonomic and Secure Computing 2011
DOI: 10.1109/dasc.2011.27

Establishing Hypothesis for Recurrent System Failures from Cluster Log Files

Abstract: A goal for the analysis of supercomputer logs is to establish causal relationships among events which reflect significant state changes in the system. Establishing these relationships is at the heart of failure diagnosis. In principle, a log analysis tool could automate many of the manual steps systems administrators must currently use to diagnose system failures. However, supercomputer logs are unstructured, incomplete and contain considerable ambiguity so that direct discovery of causal relationships is diff…

Citations: cited by 8 publications (7 citation statements)
References: 23 publications (19 reference statements)
“…The outliers indicate particularly interesting events. Furthermore, causal analysis of event sequences, e.g., [3] complementing the association and correlation analysis, would be interesting as well, since it could directly provide more actionable knowledge.…”
Section: Discussion (mentioning)
confidence: 99%
“…If there is no other, later minimal occurrence $[S'_A, E'_A] \in \mu_\alpha$, where $S_A < S'_A$, with the same property, then $[S_A, E_B]$ is a minimal occurrence of $\gamma$. For example, for the sequence $(1, a), (2, b), (3, a), (4, c), (5, b), (6, c), (7, d)$ and the episodes $\alpha = (a, b, c)$ and $\beta = (b, c, d)$, we get $\mu_\alpha = \{[1,4], [3,6]\}$, $\mu_\beta = \{[5,7]\}$, $\gamma = (a, b, c, d)$, and $\mu_\gamma = \{[3,7]\}$.…”
Section: -Episodes (mentioning)
confidence: 99%
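
The worked example in the quoted statement can be checked with a short script. This is only a minimal sketch: it assumes serial episodes (events must appear in the given order) and finds minimal occurrences by a brute-force scan of the sequence, not by the join of $\mu_\alpha$ and $\mu_\beta$ that the citing paper describes. The function name `minimal_occurrences` and the tuple-based data layout are hypothetical, chosen for illustration.

```python
def minimal_occurrences(sequence, episode):
    """Return the minimal occurrences (start, end) of a serial episode.

    sequence: list of (time, event) pairs sorted by time.
    episode:  non-empty tuple of event types that must appear in this order.
    Hypothetical helper for illustration; not taken from the cited papers.
    """
    candidates = []
    for i, (start, event) in enumerate(sequence):
        if event != episode[0]:
            continue
        # Greedily match the remaining episode events after position i,
        # which yields the earliest possible end time for this start.
        pos, end = 1, start
        for time, ev in sequence[i + 1:]:
            if pos == len(episode):
                break
            if ev == episode[pos]:
                pos += 1
                end = time
        if pos == len(episode):
            candidates.append((start, end))
    # An occurrence is minimal if it does not contain another candidate window.
    return [
        (s, e) for (s, e) in candidates
        if not any((s2, e2) != (s, e) and s <= s2 and e2 <= e
                   for (s2, e2) in candidates)
    ]


seq = [(1, "a"), (2, "b"), (3, "a"), (4, "c"), (5, "b"), (6, "c"), (7, "d")]
print(minimal_occurrences(seq, ("a", "b", "c")))       # [(1, 4), (3, 6)]
print(minimal_occurrences(seq, ("b", "c", "d")))       # [(5, 7)]
print(minimal_occurrences(seq, ("a", "b", "c", "d")))  # [(3, 7)]
```

Under these assumptions the scan reproduces the three sets given in the quoted example.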
“…ANCOR builds upon and significantly extends a previously described log analysis system, FDiag [15]. ANCOR evaluates multiple anomaly extraction algorithms.…”
Section: Introduction (mentioning)
confidence: 99%