2015 IEEE 11th International Conference on E-Science 2015
DOI: 10.1109/escience.2015.40

dispel4py: An Agile Framework for Data-Intensive eScience

Abstract: We present dispel4py, a versatile data-intensive kit presented as a standard Python library. It empowers scientists to experiment and test ideas using their familiar rapid-prototyping environment. It delivers mappings to diverse computing infrastructures, including cloud technologies, HPC architectures and specialised data-intensive machines, to move seamlessly into production with large-scale data loads. The mappings are fully automated, so that the encoded data analyses and data handling are completely unchan…

Cited by 17 publications (25 citation statements)
References 22 publications

“…Selection of metadata provenance model for representing Information concerning the creation, attribution, or version history of managed data. Depending on the complexity and accuracy of provenance information, different technologies can be used, spanning from PIDs for simple data producer citation, to more complex workflow management and tracking systems (Filgueira et al, 2015) for tracking information about full history of the processing chain.…”
Section: (Continued)
Citation type: mentioning
confidence: 99%
“…The initial reference VM is set to some VM, say v_1 ∈ V. Then, it performs one sweep where one thread of each task, in topological order rooted at the source task(s), is mapped to a slot (lines 8-24). This mapping in the order of BFS traversal increases the chance that threads of adjacent tasks in the DAG are placed on the same VM to reduce network latency.…”
Section: R-Storm Mapping (RSM)
Citation type: mentioning
confidence: 99%
“…Rather than incrementally increase or decrease resource allocation and the mapping until the QoS stabilizes, a dynamic algorithm can make use of our model to converge to a stable configuration more rapidly. Our work is of particular use for enterprises and service providers who have a large class of infrastructure applications that are run frequently [35,58], or who reuse a library of common tasks when composing their applications [7,13,24], as is common in the scientific workflow community [15]. This amortizes the cost of building task-level performance models.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…The Seismic Ambient Noise Cross-Correlation application was originally programmed as part of the VERCE project [16] using dispel4py as shown in Figure 3; both phases run on the same computing resource. With Asterism, we can distribute the execution among heterogeneous systems, leveraging their capabilities to efficiently run the applications.…”
Section: T2
Citation type: mentioning
confidence: 99%