2021
DOI: 10.22541/au.160443768.88917719/v2
Preprint

Cloud-Native Repositories for Big Scientific Data

Abstract: Scientific data has traditionally been distributed via downloads from data server to local computer. This way of working suffers from limitations as scientific datasets grow towards the petabyte scale. A “cloud-native data repository,” as defined in this paper, offers several advantages over traditional data repositories—performance, reliability, cost-effectiveness, collaboration, reproducibility, creativity, downstream impacts, and access & inclusion. These objectives motivate a set of best practices for …

citations
Cited by 4 publications
(3 citation statements)
references
References 0 publications
“…The nominal horizontal resolution of the ocean component of CM2.6 is 1/10°, therefore resolving mesoscale eddies in many regions of the ocean (Hallberg, 2013). The data and tools for analysis were obtained from the Pangeo platform (Abernathey et al, 2021).…”
Section: Data For Training and Validation
mentioning
confidence: 99%
“…The nominal horizontal resolution of the ocean component of CM2.6 is 1/10°, therefore resolving mesoscale eddies in many regions of the ocean (Hallberg, 2013). The data and tools for analysis were obtained from the Pangeo platform (Abernathey et al, 2021). The data used in the present work consists of the high-resolution simulated ocean surface velocity field u with components u (zonal) and v (meridional).…”
Section: Data For Training and Validation
mentioning
confidence: 99%
“…Zarr [31,32] is a relatively new data storage format that is flexible, compressible, and designed to be accessed with open-source software using cloud or local computing resources. The work in Reference 33 presented webGlobe, a browser-based GIS framework for interacting with climate data and other datasets available in a similar format, with optimization facilities and AI capacities for data formats based on NetCDF.…”
mentioning
confidence: 99%