2019
DOI: 10.1101/690438
Preprint

uap: Reproducible and Robust HTS Data Analysis

Abstract: Background: A lack of reproducibility has been repeatedly criticized in computational research. High throughput sequencing (HTS) data analysis is a complex multi-step process. For most of the steps a range of bioinformatic tools is available and for most tools manifold parameters need to be set. Due to this complexity, HTS data analysis is particularly prone to reproducibility and consistency issues. We have defined four criteria that in our opinion ensure a minimal degree of reproducible research for HTS data…

Cited by 2 publications (4 citation statements)
References 37 publications
“…Interestingly, the principal component analysis revealed only an insufficient separation of IL-12/IL-18 or a-CD3/a-CD28-stimulated MAIT cells, but a strong separation of the combined treatment, compared to unstimulated cells (Figure 5A). 2,570 proteins were jointly expressed in all 3 ways of activation, while a minor number of proteins (20–46) were selectively expressed after the different treatments (Figure 5B). Analysis of significantly altered proteins revealed that more proteins were regulated in a-CD3/a-CD28 (TCR-dependently)- than IL-12/IL-18- (TCR-independently) activated cells (Figure 5C).…”
Section: Results (mentioning)
confidence: 99%
“…The workflow management system uap (20) was used to transform the paired-end sequencing reads into a quantitative presence/absence table. The RNA-Seq workflow, which is included in the software, was customized to the requirements of this analysis: Fastq files were merged followed by a quality control with FastQC (21) and a quality filter using the FASTQ Quality Filter of the FASTX-toolkit (22).…”
Section: Methods (mentioning)
confidence: 99%
“…One important constituent of reproducibility is reporting all versions, parameters, and data connections and linking analysis code and results. We provide a software for the analysis of omics data that supports reproducibility (Kämpf et al 2019). Additionally, full reproducibility also requires to preserve the computational environment that was used for the analysis (Grüning et al 2018).…”
Section: Discussion (mentioning)
confidence: 99%
“…Linking executable analysis code with intermediate and final resulting data is promoted as the gold standard of reproducibility by Peng (2011), while Sandve and colleagues suggest to follow the "ten simple rules for reproducible computational research" (Sandve et al 2013). Workflow management systems, like Galaxy, KNIME, or uap promote the use of reproducible research principles in bioinformatic analyses (Goecks et al 2010; Berthold et al 2007; Di Tommaso et al 2017; Kämpf et al 2019). None of the currently available integration methods specifically supports reproducible research principles.…”
Section: Multi-omics Data Analysis (mentioning)
confidence: 99%