2005
DOI: 10.1145/1084805.1084813

A notation and system for expressing and executing cleanly typed workflows on messy scientific data

Abstract: The description, composition, and execution of even logically simple scientific workflows are often complicated by the need to deal with "messy" issues like heterogeneous storage formats and ad-hoc file system structures. We show how these difficulties can be overcome via a typed, compositional workflow notation within which issues of physical representation are cleanly separated from logical typing, and by the implementation of this notation within the context of a powerful runtime system that supports distri…

citations
Cited by 88 publications
(58 citation statements)
references
References 11 publications
“…SwiftScript [30][31], on the other hand, serves as a general purpose coordination language, where existing applications can be invoked without modification. We call this the "Black-Box" approach, in which we focus more on the input data and output data of each computing node, and the flow of the data.…”
Section: Language Challenge (mentioning)
confidence: 99%
“…The three DAGs are shown in Figure 6 and are: Montage [39] with 34 tasks, AIRSN [40] with 53 tasks, and LIGO [41] with 77 tasks. The method in [45] was adopted to model the heterogeneity of the mean task execution time estimate, which was randomly generated in the range of [1,100].…”
Section: A Settings (mentioning)
confidence: 99%
“…We believe that to properly design a dataflow repository, you need a formal model for dataflows and runs. Although there are several dataflow specification languages [9,12,13,1,2], to our knowledge, none of them presents a formal model of repository storing dataflows and runs. With increasing importance of provenance [14,15,16,17], often with different interpretations for this term, it is essential that our model includes a formal definition of the kind of provenance that our work targets.…”
Section: Related Work (mentioning)
confidence: 99%