2019 15th International Conference on eScience (eScience) 2019
DOI: 10.1109/escience.2019.00047

Efficient Runtime Capture of Multiworkflow Data Using Provenance

Abstract: Computational Science and Engineering (CSE) projects are typically developed by multidisciplinary teams. Despite being part of the same project, each team manages its own workflows, using specific execution environments and data processing tools. Analyzing the data processed by all workflows globally is a core task in a CSE project. However, this analysis is hard because the data generated by these workflows are not integrated. In addition, since these workflows may take a long time to execute, data analysis n…

Cited by 18 publications (74 citation statements)
References 20 publications

“…Users can use online provenance analysis to monitor, debug or inspect the data transformations while they are still running (e.g., see the status, see how the intermediate results are evolving as the input parameters vary). The problem of adding low provenance data capture overhead is more challenging for provenance systems that allow for online analysis [17]. Queries Q3-Q5 exemplify queries that can be executed online, e.g., while a training process is running.…”
Section: Characterizing Provenance Analysis In ML For CSE (mentioning)
confidence: 99%
“…Provenance tracking comprises provenance capture, the creation of the provenance relationships (e.g., associations between the processes and the consumed and generated data), and storage of the provenance data. In our view, provenance tracking systems that can be coupled to workflows [12,13,15,17] provide the flexibility needed in large-scale CSE projects, as opposed to moving workflows' executions and data to be managed by a single orchestration system, like a Workflow Management System. Workflow provenance capture systems usually address scripts as workflows with chained functions, method, or library calls that execute data transformations, while capturing input arguments and output values from these calls.…”
Section: Provlake In the ML Lifecycle In CSE (mentioning)
confidence: 99%
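
The capture pattern described in the statement above (treating a script as a workflow of chained calls and recording each call's input arguments and output values) can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the API of ProvLake or of any cited system: `capture_provenance`, `PROVENANCE_LOG`, `normalize`, `top_k`, and `run_workflow` are hypothetical names, and the in-memory list stands in for whatever provenance store a real system would use.

```python
import functools
import time

# Hypothetical in-memory provenance store (assumption); a real system
# would persist these records to a database or provenance service.
PROVENANCE_LOG = []


def capture_provenance(func):
    """Record the arguments, return value, and timing of each call to func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        started = time.time()
        result = func(*args, **kwargs)
        PROVENANCE_LOG.append({
            "task": func.__name__,
            "inputs": {"args": args, "kwargs": kwargs},
            "output": result,
            "started": started,
            "ended": time.time(),
        })
        return result
    return wrapper


@capture_provenance
def normalize(values):
    # Hypothetical data transformation: scale values so they sum to 1.
    total = sum(values)
    return [v / total for v in values]


@capture_provenance
def top_k(values, k=1):
    # Hypothetical data transformation: keep the k largest values.
    return sorted(values, reverse=True)[:k]


def run_workflow(raw):
    # Chained calls: the output of normalize is the input of top_k.
    return top_k(normalize(raw), k=2)
```

Calling `run_workflow([1, 3, 4])` appends one record per transformation to `PROVENANCE_LOG`. Because the output of the first record is the input of the second, the relationship between the two calls can be recovered from the log while the script is still running, which is the kind of online inspection the citing authors describe.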