Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing 2019
DOI: 10.1145/3307681.3325400

Parsl

Abstract: High-level programming languages such as Python are increasingly used to provide intuitive interfaces to libraries written in lower-level languages and for assembling applications from various components. This migration towards orchestration rather than implementation, coupled with the growing need for parallel computing (e.g., due to big data and the end of Moore's law), necessitates rethinking how parallelism is expressed in programs. Here, we present Parsl, a parallel scripting library that augments Python …
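The abstract describes Parsl's central idea: parallelism is expressed by decorating ordinary Python functions as "apps" whose invocations return futures rather than values. As a rough stdlib-only analogy (this is not Parsl's own API, just a sketch of the pattern using `concurrent.futures`):

```python
from concurrent.futures import ThreadPoolExecutor, Future

# Sketch of the decorator-and-futures pattern Parsl builds on:
# a decorator turns a plain function into one that runs asynchronously
# and returns a Future instead of a value.
_executor = ThreadPoolExecutor(max_workers=4)

def app(fn):
    def submit(*args, **kwargs) -> Future:
        return _executor.submit(fn, *args, **kwargs)
    return submit

@app
def square(x):
    return x * x

# Invocations return immediately; .result() blocks until the task finishes.
futures = [square(i) for i in range(5)]
print([f.result() for f in futures])  # -> [0, 1, 4, 9, 16]
```

Parsl itself provides this style through its `python_app` decorator and pluggable executors that dispatch tasks to clusters rather than local threads.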

Cited by 162 publications (21 citation statements)
References 19 publications
“…The first phase of the pipeline is integrated with the Advanced Photon Source (APS) Data Management System at the beamline, which deposits each newly acquired image into a Globus-accessible storage system at the APS. As new images are acquired, Globus Automate flows are launched to process them as follows: 1) moves new files from APS to Theta by using the Globus Transfer service ( 35 ); 2) performs DIALS stills_process ( 36 ) on batches of 256 images by using funcX ( 37 ), a function-as-a-service computation system [funcX uses Parsl ( 38 ) to abstract and acquire nodes on Theta as needed, and dispatches tasks to available nodes]; 3) extracts metadata from files regarding identified diffractions and generates visualizations (funcX) showing the locations of positive hits on the mesh; and 4) publishes raw data, metadata, and visualizations to a portal on the Argonne Leadership Computing Facility (ALCF) Petrel data system ( 39 ). The result of this automated process is an indexed, searchable data collection that provides full traceability from data acquisition to processed data that can be used to inspect and update the running experiment.…”
Section: Methods
confidence: 99%
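The quoted passage describes a four-step flow: transfer images, process them in batches of 256, extract metadata and visualizations, and publish the results. The control flow can be sketched in plain Python with stub functions; the names (`transfer`, `process_batch`, `extract_metadata`, `publish`) are stand-ins for illustration, not the Globus Automate or funcX APIs:

```python
# Hypothetical sketch of the four-step flow described above.
def transfer(images):            # step 1: move files (Globus Transfer)
    return list(images)

def process_batch(batch):        # step 2: DIALS stills_process via funcX
    return [f"indexed:{img}" for img in batch]

def extract_metadata(results):   # step 3: metadata + visualizations
    return {"hits": len(results)}

def publish(raw, meta):          # step 4: publish to the data portal
    return {"raw": raw, "meta": meta}

def run_flow(images, batch_size=256):
    staged = transfer(images)
    results = []
    for i in range(0, len(staged), batch_size):
        results += process_batch(staged[i:i + batch_size])
    meta = extract_metadata(results)
    return publish(results, meta)

record = run_flow([f"img{i}" for i in range(3)], batch_size=2)
print(record["meta"])  # -> {'hits': 3}
```

In the real pipeline each step is a remote service call (Globus Transfer, funcX function execution, portal publication) launched by a Globus Automate flow rather than an in-process function.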
“…It provides an easy entry to cloud-based computing for biomolecular simulation scientists. Crossbow shares many of its design aspects with Parsl [23]. It provides tools to wrap Python functions and external applications (e.g., legacy MD simulation codes), in such a way that they can be combined into workflows using a task-based paradigm.…”
Section: Related Work
confidence: 99%
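The Crossbow comparison highlights a capability shared with Parsl: wrapping external applications (e.g., legacy MD simulation codes) so they can be composed into task-based workflows. Parsl's facility for this is its `bash_app` decorator; the pattern can be sketched stdlib-only as follows (the `shell_task` helper is hypothetical, not the Crossbow or Parsl API):

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=2)

def shell_task(cmd_template):
    """Wrap an external command line as an async task returning its stdout.
    A sketch of the wrapping pattern only, not a real library API."""
    def submit(**kwargs):
        cmd = cmd_template.format(**kwargs)
        return _pool.submit(
            lambda: subprocess.run(
                cmd, shell=True, capture_output=True, text=True, check=True
            ).stdout.strip()
        )
    return submit

# A wrapped command is invoked like a function and yields a future,
# so external codes compose with Python tasks in one workflow.
echo = shell_task("echo {msg}")
future = echo(msg="hello")
print(future.result())  # -> hello
```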
“…Furthermore, there are usually multiple computational activities in bioinformatics analyses, including filtering, normalization, and annotation. Efforts to ensure reproducibility (Cohen-Boulakia et al., 2017) of these analyses involve (but are not limited to) task composition tools (scripts (Babuji et al., 2019), pipelines, scientific workflows (Liew et al., 2016), and software containers (Boettiger, 2015)), web-based software platforms such as Galaxy (Bedoya-Reina et al., 2013), commonly used applications, and source code available in repositories such as GitHub. We explore these issues in more detail in Section 4.6.…”
Section: Biodiversity Genomics
confidence: 99%