2021
DOI: 10.48550/arxiv.2106.14423
Preprint

Operational Data Analytics in Practice: Experiences from Design to Deployment in Production HPC Environments

Abstract: As HPC systems grow in complexity, efficient and manageable operation is increasingly critical. Many centers are thus starting to explore the use of Operational Data Analytics (ODA) techniques, which extract knowledge from massive amounts of monitoring data and use it for control and visualization purposes. As ODA is a multifaceted problem, much effort has gone into researching its separate aspects: however, accounts of production ODA experiences are still hard to come across. In this work we aim to bridge the …

Citations: cited by 1 publication (1 citation statement)
References: 26 publications (49 reference statements)
“…); ii) facility managers, concerned with system-wide issues, such as energy consumption, mortgage costs, and thermal/cooling problems [14]; iii) system accountants, who need to manage and keep track of the different accounts and projects, granting access to users, and providing reports and statistics on the usage of the machines [15]; iv) HPC users, industrial and academic partners who submit jobs, and are typically interested in fast completion times and fair prices [16]. Finally, the diversity of roles implies different use-cases, which often require ad-hoc analytics solutions [17].…”
Section: IIoT and HPC Systems
Citation type: mentioning
confidence: 99%