Fifth International Workshop on Systems and Network Telemetry and Analytics 2022
DOI: 10.1145/3526064.3534111

Studying Scientific Data Lifecycle in On-demand Distributed Storage Caches

Abstract: The XRootD system is used to transfer, store, and cache large datasets from high-energy physics (HEP). In this study, we focus on its capability as a distributed on-demand storage cache. By exploring a large set of daily log files from 2020 and 2021, we seek to understand the data access patterns that might inform future cache design. Our study begins with a set of summary statistics regarding file read operations, file lifetimes, and file transfers. We observe that the number of read operations on each f…

Cited by 2 publications (1 citation statement)
References 12 publications
“…Although this caching system is common enough in HEP computing environments, not often can it be found used directly by the analysis framework, whereas usually it is activated at the level of the grid site. Many efforts from literature are focused on analysing access patterns of certain datasets and evaluating different strategies to improve network and I/O usage [141,142,143].…”
Section: State Of the Art (mentioning)
confidence: 99%