Proceedings of the ACM International Workshop on Data-Intensive Distributed Computing 2016
DOI: 10.1145/2912152.2912157

Persistent Data Staging Services for Data Intensive In-situ Scientific Workflows

Abstract: Scientific simulation workflows executing on very large scale computing systems are essential modalities for scientific investigation. The increasing scales and resolution of these simulations provide new opportunities for accurately modeling complex natural and engineered phenomena. However, the increasing complexity necessitates managing, transporting, and processing unprecedented amounts of data, and as a result, researchers are increasingly exploring data-staging and in-situ workflows to reduce data moveme…

Cited by 8 publications (2 citation statements)
References: 23 publications

“…Programming models: Existing big data toolkits (e.g., Hadoop [12], Spark [28], AllPairs [19], DataSpaces [25], etc) already provide an extensive collection of ready-to-use functionalities. It is critical that students understand the underlying programming paradigms implemented in these functionalities.…”
Section: Learning Activities (mentioning)
confidence: 99%
“…A common practice is to use a separate or external parallel computer to prepare data for subsequent processing, but this strategy not only limits the amount of data that can be saved, but also turns I/O into a performance bottleneck when using a large parallel system. The most plausible solution for the exascale data problem is to reduce or transform the data in-situ [17] to perform subsequent processing locally or even while it is being generated.…”
Section: In-situ Processing (mentioning)
confidence: 99%