2018
DOI: 10.7287/peerj.preprints.27141v1
Preprint

Approaches for containerized scientific workflows in cloud environments with applications in life science

Abstract: Containers are gaining popularity in life science research because they encompass all dependencies of provisioned tools, simplify software installation for end users, and offer a form of isolation between processes. Scientific workflows are well suited to chaining containers into data analysis pipelines that sustain reproducible science. In this manuscript we review the different approaches to using containers inside the workflow tools Nextflow, Galaxy, Pachyderm, Luigi, and SciPipe when deployed in cloud enviro…

Cited by 3 publications (4 citation statements)
References 7 publications
“…Such a Docker runner would allow to follow an image-per-task execution pattern, where each task is executed using a different container image ( Spjuth et al, 2018 ). An example of this execution pattern can be found in the GenomeFastScreen pipeline ( https://sing-group.org/compihub/explore/5e2eaacce1138700316488c1 ), although in this case “docker run” commands are included in each task rather than provided in a runners file for the sake of simplicity.…”
Section: Results
mentioning
confidence: 99%
“…As noted in the previous section, this is another way to deal with dependency management, as such Docker images contain all the dependencies required by the pipeline. Pipelines distributed in this way follow an image-per-pipeline execution pattern in which all tasks are executed using the same image container ( Spjuth et al, 2018 ) ( Fig. 5C ), and can even be run using Docker-compatible container technologies such as Singularity.…”
Section: Results
mentioning
confidence: 99%
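
The two execution patterns named in the citation statements above, image-per-task and image-per-pipeline, can be contrasted with a minimal Python sketch. It only builds `docker run` argument lists and does not execute anything. The task names, shell commands, image names, and the `docker_commands` helper are all hypothetical and chosen for illustration; they are not taken from the cited papers or from any of the workflow tools.

```python
from typing import Dict, List, Optional

# Hypothetical task name -> shell command mapping (illustrative only).
TASKS: Dict[str, str] = {
    "align": "echo aligning",
    "call_variants": "echo calling",
}


def docker_commands(
    tasks: Dict[str, str],
    task_images: Optional[Dict[str, str]] = None,
    pipeline_image: Optional[str] = None,
) -> List[List[str]]:
    """Build one `docker run` argument list per task.

    image-per-task:     pass task_images (each task gets its own image).
    image-per-pipeline: pass pipeline_image (all tasks share one image).
    """
    # Exactly one of the two patterns must be selected.
    if (task_images is None) == (pipeline_image is None):
        raise ValueError("provide exactly one of task_images or pipeline_image")

    commands = []
    for name, shell_cmd in tasks.items():
        image = pipeline_image if pipeline_image is not None else task_images[name]
        # `--rm` removes the container once the task command exits.
        commands.append(["docker", "run", "--rm", image, "sh", "-c", shell_cmd])
    return commands


if __name__ == "__main__":
    # Image names below are placeholders, not real published images.
    per_task = docker_commands(
        TASKS,
        task_images={
            "align": "example/aligner:1.0",
            "call_variants": "example/caller:2.0",
        },
    )
    per_pipeline = docker_commands(TASKS, pipeline_image="example/pipeline:1.0")
    for cmd in per_task + per_pipeline:
        print(" ".join(cmd))
```

In this sketch the only difference between the patterns is where the image name comes from: a per-task lookup or a single shared value. Per-task images keep each tool's dependencies separate, while a single pipeline image bundles everything the pipeline needs into one container image.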
“…Increasing interest in workflow development systems that track data and software provenance, enable scalability and reproducibility, and re-entrant code ( Wratten et al 2021 ) have led to the development of several workflow languages, largely inspired by GNU Make ( Stallman and McGrath 1991 ; Köster and Rahmann 2012 ; Amstutz et al 2016 ). Nextflow is a Domain Specific Language ( Di Tommaso et al 2017 ) that currently leads workflow systems in terms of ease of scripting and submitting to cloud computing resources ( Fjukstad and Bongo 2017 ; Leipzig 2017 ; Spjuth et al 2020 ; Jackson et al 2021 ). A key benefit of Nextflow compared with earlier workflow languages is being able to submit jobs to a local machine, an HPC, or cloud-based compute environments.…”
mentioning
confidence: 99%