2019 IEEE/ACM 41st International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP) 2019
DOI: 10.1109/icse-seip.2019.00021

Tools and Benchmarks for Automated Log Parsing

Abstract: Logs are imperative in the development and maintenance process of many software systems. They record detailed runtime information that allows developers and support engineers to monitor their systems and dissect anomalous behaviors and errors. The increasing scale and complexity of modern software systems, however, make the volume of logs explode. In many cases, the traditional way of manual log inspection becomes impractical. Many recent studies, as well as industrial tools, resort to powerful text search an…

Cited by 340 publications (313 citation statements)
References 41 publications
“…Therefore, template extraction from software logs is the most widely-applicable approach and thus logzip proposes an iterative clustering algorithm to extract templates from logs automatically. According to the benchmark by Zhu et al [17], existing template extraction approaches could perform accurately on software logs. However, these methods require all the historical logs as input, leading to severe inefficiency and hindering them from adoption in practice.…”
Section: A Overview
Citation type: mentioning
confidence: 99%
“…LKE and IPLoM are offline parsers, and SHISO and Drain could parse online in a streaming manner. These parsers are evaluated and compared in the benchmark by Zhu et al [17]. The parsers could extract hidden structures but they take all logs as input thus are not efficient compared with the proposed ISE.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…The log data set BlueGene/L (BGL) is a public, partially tagged data set from IBM's renowned high-performance computing lab (Lawrence Livermore National Labs, LLNL). The BGL data set [33,34] contains 4,747,963 raw log messages of 215 days, with a size of 708 M. HPC data set [33,34] is a high-performance cluster log collected by Los Alamos National Labs, containing 433,490 raw log messages. HDFS data set [33,34] is a log data set collected from the 203 node cluster of the Amazon EC2 platform.…”
Section: Experimental Environment and Data Sets
Citation type: mentioning
confidence: 99%