2019
DOI: 10.1177/1094342019839124

Computational reproducibility of scientific workflows at extreme scales

Abstract: We propose an approach for improved reproducibility that includes capturing and relating provenance characteristics and performance metrics, in a hybrid queriable system, the ProvEn server. The system capabilities are illustrated on two use cases: scientific reproducibility of results in the ACME climate simulations and performance reproducibility in molecular dynamics workflows on HPC computing platforms.

Cited by 14 publications (7 citation statements)
References 43 publications (39 reference statements)
“…These types of programmes are capable of building all necessary software in a managed way. Workflow management systems can capture detailed experiment provenance information for reproducibility and allow scientists to define the workflow of their computational experiments more robustly [89][90][91][92][93][94][95][96]. How to execute computational experiments is made clearer when using such systems.…”
Section: (A) Relationship To Prior Work (mentioning)
confidence: 99%
“…Global summations run in parallel may be non-deterministic; larger problems are likely to lead to lower reproducibility due to additional threads and processors. Unfortunately, there are no community standards for acceptable reproducibility on exascale systems (Pouchard et al, 2019). Our implementation offers an option to improve reproducibility with a minimal impact to performance, allowing designers to better control the acceptable amount of reproducibility to performance ratios.…”
Section: Reproducibility (mentioning)
confidence: 99%
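
The order dependence described in the statement above can be illustrated with a minimal Python sketch. The helper names `naive_sum` and `chunked_sum` and the `workers` parameter are hypothetical and stand in for a parallel reduction; this is not code from the cited work. The same four values are summed with different decompositions, and the result changes because floating-point addition is not associative.

```python
def naive_sum(values):
    """Left-to-right floating-point summation with no compensation."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, workers):
    """Emulate a parallel reduction: each worker sums a contiguous chunk,
    then the partial sums are combined in order."""
    size = -(-len(values) // workers)  # ceiling division
    partials = [naive_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return naive_sum(partials)


# The exact sum of these values is 2.0.
values = [1e16, 1.0, -1e16, 1.0]

one_worker = chunked_sum(values, workers=1)   # 1.0
two_workers = chunked_sum(values, workers=2)  # 0.0
print(one_worker, two_workers)
```

Neither decomposition returns the exact sum, and the two disagree with each other, which is the behavior a change in thread or processor count can expose.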
“…Variables prefixed with l are the local terms. 512-bit vector would use vector (Pouchard et al, 2019), 8*double, and the for loop would increment by 8 for each iteration. The implementation is limited to where there is a C++ compiler.…”
Section: Kahan Algorithm Vectorized (mentioning)
confidence: 99%
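
As a rough illustration of the idea in the statement above, the following Python sketch applies Kahan (compensated) summation across eight independent lanes, mirroring a loop that advances by 8 per iteration. It is an assumption-laden emulation in plain Python, not the citing paper's C++ implementation: the function names and the `lanes` parameter are hypothetical, and no actual SIMD instructions are used.

```python
import math


def kahan_sum(values):
    """Scalar Kahan summation: carry a running compensation term."""
    total = 0.0
    comp = 0.0  # rounding error accumulated so far
    for v in values:
        y = v - comp
        t = total + y
        comp = (t - total) - y  # what was lost when forming t
        total = t
    return total


def kahan_sum_lanes(values, lanes=8):
    """Kahan summation with one (total, compensation) pair per lane.
    The outer loop steps by `lanes`, as a vectorized loop would."""
    totals = [0.0] * lanes
    comps = [0.0] * lanes
    for start in range(0, len(values), lanes):
        for lane, v in enumerate(values[start:start + lanes]):
            y = v - comps[lane]
            t = totals[lane] + y
            comps[lane] = (t - totals[lane]) - y
            totals[lane] = t
    # Combine lane totals and their remaining compensations.
    return kahan_sum(totals + [-c for c in comps])


data = [0.1] * 1000
reference = math.fsum(data)  # correctly rounded reference sum
print(abs(kahan_sum(data) - reference), abs(kahan_sum_lanes(data) - reference))
```

Keeping a separate compensation term per lane lets the lanes proceed independently, at the cost of a final reduction step that combines the lane results.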
“…Reporting data provenance is crucial for building trust and traceability of data. By including metadata that describe the data production process, comprehensive documentation guarantees the preservation of data and methodologies, ensures repeatability, and supports analysis. …”
Section: Introduction (mentioning)
confidence: 99%
“…The need for usable, automated workflows has been recognized in many research fields, including MM. Community standards are needed on how MM data should be reported and exchanged. Each data-generating process must be included as metadata to enable repeatability, replicability, and reproducibility to guarantee that users can reliably repeat a process under identical conditions.…”
Section: Introduction (mentioning)
confidence: 99%