SC18: International Conference for High Performance Computing, Networking, Storage and Analysis 2018
DOI: 10.1109/sc.2018.00077

A Year in the Life of a Parallel File System

Abstract: I/O performance is a critical aspect of data-intensive scientific computing. We seek to advance the state of the practice in understanding and diagnosing I/O performance issues through investigation of a comprehensive I/O performance data set that captures a full year of production storage activity at two leadership-scale computing facilities. We demonstrate techniques to identify regions of interest, perform focused investigations of both long-term trends and transient anomalies, and uncover the contributing …

Cited by 42 publications (20 citation statements)
References 24 publications
“…In the tests, we increase the number of clients from 4 to 32 for both the models. We run the tests with Darshan-3.1.7, which is a well-established tool for HPC I/O tracing and characterization [20,21,37,38]. From the Darshan logs, we collect the file access patterns posed by the ImageNet data reader module in This phenomenon makes the application read less amount of data while the metadata overhead for file reading remains the same.…”
Section: Lbann Benchmarks (mentioning)
confidence: 99%
“…Besides, there has been another case study by Yu et al [50] on the performance of Lustre over Quadrics and InfiniBand using sequential and parallel I/O, metadata and application benchmarks. While quite a number of recent studies [33,36,40,46-49,52] have demonstrated different techniques to evaluate the I/O performance of PFSs in supercomputing facilities, many other studies [34,39,43] have emphasized on PFS, application and data API tuning and optimization to acquire better I/O throughput on HPC systems. Although many research attempts have been taken for I/O performance analysis of different PFSs, there is a lack of understanding on the characteristics of BeeGFS I/O and metadata performance, particularly its capability of handling the workloads posed by DL applications.…”
Section: Related Work (mentioning)
confidence: 99%
“…This framework includes access to privileged monitoring information from I/O subsystems and vendor APIs, but it also makes use of application data collected by using Darshan. By continuously monitoring a data center over time, one can, for example, detect performance regressions [24]. TOKIO specifically.…”
Section: Workflow Engines For Automation In Hpc Environments (mentioning)
confidence: 99%
“…A large body of knowledge exists on data motion between compute and storage from the perspectives of both applications and systems [6]-[11] because this form of I/O blocks forward progress on running jobs. However, data-driven workflows rely on data movement not only between compute and storage systems, but also between storage systems and between storage and external networks as depicted in Figure 1.…”
Section: Introduction (mentioning)
confidence: 99%