2015
DOI: 10.7287/peerj.preprints.1171v1
Preprint

The impact of Docker containers on the performance of genomic pipelines

Abstract: Genomic pipelines consist of several pieces of third-party software and, because of their experimental nature, frequent changes and updates are commonly necessary, thus raising serious distribution and reproducibility issues. Docker container technology offers an ideal solution, as it allows the packaging of pipelines in an isolated and self-contained manner. This makes it easy to distribute and execute pipelines in a portable manner across a wide range of computing platforms. Thus the question that arises is to …


Cited by 28 publications (31 citation statements)
References 0 publications
“…It allows the seamless parallelization and deployment of any existing application with minimal development and maintenance overhead, irrespective of the original programming language. The built-in support for container technologies such as Docker 5 and Shifter 6 , along with the native integration with the Git tool and popular code-sharing platforms like GitHub, make it possible to precisely prototype self-contained computational workflows, maintain all variations over time and rapidly reproduce any former configuration one may need to re-use. These capabilities guarantee consistent results over time and across different computing platforms.…”
Section: Methods
confidence: 99%
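The capabilities described in this citation correspond to workflow managers such as Nextflow, where container support is enabled through configuration. As an illustrative sketch (the container image and tag are assumptions, not taken from the source), running every process of a pipeline inside Docker takes only a couple of lines:

```groovy
// nextflow.config — hedged sketch; the image name and tag below are
// illustrative assumptions, not drawn from the cited paper.
docker.enabled = true      // execute every process inside a Docker container
process.container = 'biocontainers/samtools:v1.9-4-deb_cv1'
```

Because the image tag pins an exact software environment, re-running the pipeline later, or on a different machine, reproduces the same toolchain.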
“…Docker containers are shown to have performance better than, or equal to, that of VMs [19]. Both forms of virtualization introduce overhead in I/O-intensive workloads, especially VMs, but introduce negligible CPU and memory overhead. For precision medicine pipelines the overhead of Docker containers will be negligible, since these tend to be compute intensive and they typically run for several hours.…”
Section: Related Work
confidence: 99%
“…For precision medicine pipelines the overhead of Docker containers will be negligible, since these tend to be compute intensive and they typically run for several hours [19]. Containers have also been proposed as a solution to improve experiment reproducibility, by ensuring that the data analysis tools are installed with the same dependencies. [20]…”
Section: Related Work
confidence: 99%
“…The container virtualization technology, represented by Docker, enables users to create a software runtime environment isolated from the host machine [3]. This technology, which is also gaining popularity in the biomedical research domain, is a promising way to solve the problem of installing software tools [4]. Along with containers, workflow description and execution frameworks such as those from the Galaxy project [5] or the Common Workflow Language (CWL) project [6] have lowered the barrier to deploying a data analysis environment on a new computing platform.…”
confidence: 99%
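As a hedged illustration of how CWL combines with containers (the tool, image tag, and file names below are assumptions, not taken from the cited papers), a CommandLineTool can pin its runtime environment with a DockerRequirement, so the same description runs identically on any host with a container engine:

```yaml
# Hedged sketch of a CWL CommandLineTool; the image tag and the
# input/output names are illustrative assumptions.
cwlVersion: v1.2
class: CommandLineTool
baseCommand: [samtools, flagstat]
requirements:
  DockerRequirement:
    dockerPull: quay.io/biocontainers/samtools:1.9--h10a08f8_12
inputs:
  alignments:
    type: File
    inputBinding: {position: 1}
outputs:
  stats:
    type: stdout
stdout: flagstat.txt
```

A CWL runner pulls the pinned image before execution, which is what lets the same tool description move unchanged between laptops, clusters, and clouds.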