2018 IEEE International Conference on Cluster Computing (CLUSTER) 2018
DOI: 10.1109/cluster.2018.00073

A Big Data Analytics Framework for HPC Log Data: Three Case Studies Using the Titan Supercomputer Log

Cited by 13 publications (7 citation statements)
References 19 publications
“…Furthermore, the resource allocation vicinity will be further studied via analyzing jobs' information that are executed across multiple nodes. Performing similar analyses on publicly available system logs, such as those from the Failure Trace Archive (FTA) or the Computer Failure Data Repository (CFDR), as well as other types of monitoring data, such as node power consumption, are also planned as future work.…”
Section: Discussion (mentioning)
confidence: 99%
“…For example, points A: (1, 10) and B: (4, 6) in a 2D Cartesian representation are separated by a distance of 4 − 1 = 3 on the X axis and 10 − 6 = 4 on the Y axis, respectively. Defining the new dimension Z, according to a common (but so far unseen) feature of A and B would result in a 3D representation of A: (1, 10, 5) and B: (4, 6, 5). Here '5' denotes that common feature.…”
Section: A Vicinities (mentioning)
confidence: 99%
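The arithmetic in the quoted example can be checked with a short sketch. The helper name `axis_separations` is hypothetical and does not come from the citing paper. The reading that a shared Z value contributes zero separation on the new axis is an interpretation of the quote, not a stated result.

```python
# Points taken from the quoted example.
A = (1, 10)
B = (4, 6)


def axis_separations(p, q):
    """Return the absolute per-axis separation between two points.

    Hypothetical helper for illustration; both points must have the
    same number of coordinates.
    """
    if len(p) != len(q):
        raise ValueError("points must have the same dimension")
    return tuple(abs(a - b) for a, b in zip(p, q))


# 2D case: separation on X and on Y.
sep_2d = axis_separations(A, B)

# Add a third coordinate Z holding the shared feature value 5 from the quote.
A3 = A + (5,)
B3 = B + (5,)
sep_3d = axis_separations(A3, B3)

print(sep_2d)  # (3, 4)
print(sep_3d)  # (3, 4, 0)
```

Under this reading, the two points stay 3 and 4 apart on X and Y but coincide on Z, which is one way to picture their being in the same vicinity along the newly defined dimension.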
“…In [45], the authors integrated probabilistic analysis with an optimized K-means algorithm to detect error propagation across the nodes in a HPC system. In [46], the authors presented a big-data analytics framework that mines event patterns and provides user application and system event correlations. In [47], the authors presented a scalable, intuitive HPC data analysis framework.…”
Section: Related Work (mentioning)
confidence: 99%
“…More existing literature of user authentication techniques and mechanisms used in distributed HPC systems can be found in [4,14,28]. As machine learning has been widely spread into different domains, a more direct and effective strategy has come into people's mind to defend HPC systems, log file-based defense [24,29,30], which is a behavior detection method based on log files generated from running jobs. Using log files has many advantages than previous methods, as log files are a detailed record of the running process of the applications.…”
Section: Intrusion Detection (mentioning)
confidence: 99%