System logs are useful to understand the status of and detect faults in large scale networks. However, due to their diversity and volume of these logs, log analysis requires much time and effort. In this paper, we propose a log event anomaly detection method for large-scale networks without pre-processing and feature extraction. The key idea is to embed a large amount of diverse data into hidden states by using latent variables. We evaluate our method with 12 months of system logs obtained from a nationwide academic network in Japan. Through comparisons with Kleinberg's univariate burst detection and a traditional multivariate analysis (i.e., PCA), we demonstrate that our proposed method achieves 14.5% higher recall and 3% higher precision than PCA. A case study shows detected anomalies are effective information for troubleshooting of network system faults.
Summary
One of the ways to analyze unstructured log messages from large‐scale IT systems is to classify log messages with log templates generated by template generation methods. However, there is currently no common knowledge pertained to the comparison and practical use of log template generation methods because they are implemented on the basis of diverse environments. To this end, we design and implement amulog, a general log analysis framework for comparing and combining diverse log template generation methods. Amulog consists of three key functions: (1) parsing log messages into headers and segmented messages, (2) classifying the log messages using a scalable template‐matching method, and (3) storing the structured data in a database. This framework helps us easily utilize time‐series data corresponding to the log templates for further analysis. We evaluate amulog with a log dataset collected from a nation‐wide academic network and demonstrate that it classifies the log data in a reasonable amount of time even with over 100,000 log template candidates. The template‐matching method in amulog also reduces 75% processing time for template generation and keeps the accuracy when combined with an existing structure‐based template generation method. In order to show the effectiveness of amulog in comparing log template generation methods, we demonstrate that the appropriate template generation methods and accuracy metrics largely depend on the purpose of further analysis by comparing the accuracy of six existing log template generation methods with 10 different accuracy metrics on amulog.
System logs are useful to understand the status of and detect faults in large scale networks. However, due to their diversity and volume of these logs, log analysis requires much time and effort. In this paper, we propose a log event anomaly detection method for largescale networks without pre-processing and feature extraction. The key idea is to embed a large amount of diverse data into hidden states by using latent variables. We evaluate our method with 15 months of system logs obtained from a nationwide academic network in Japan. Through comparisons with Kleinberg's univariate burst detection and a traditional multivariate analysis (i.e., PCA), we demonstrate that our proposed method detects anomalies and ease troubleshooting of network system faults.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.