2016
DOI: 10.3389/fgene.2016.00075

Integrated Systems for NGS Data Management and Analysis: Open Issues and Available Solutions

Abstract: Next-generation sequencing (NGS) technologies have deeply changed our understanding of cellular processes by delivering an astonishing amount of data at affordable prices; nowadays, many biology laboratories have already accumulated a large number of sequenced samples. However, managing and analyzing these data poses new challenges, which may easily be underestimated by research groups devoid of IT and quantitative skills. In this perspective, we identify five issues that should be carefully addressed by resea…

Cited by 45 publications (38 citation statements)
References 30 publications
“…Their quality was evaluated and confirmed using the FastQC application: (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Pipelines for primary analysis (filtering and alignment to the reference genome of the raw reads) and secondary analysis (expression quantification, differential gene expression, and peak calling) have been integrated in the HTS‐flow system. Bioinformatic and statistical analysis were performed using R with Bioconductor and comEpiTools packages.…”
Section: Methods (mentioning)
confidence: 99%
“…bioinformatics.babraham.ac.uk/projects/fastqc/). Pipelines for primary analysis (filtering and alignment to the reference genome of the raw reads) and secondary analysis (expression quantification, differential gene expression, and peak calling) have been integrated in the HTS-flow system [52]. Bioinformatic and statistical analysis © 2019 The Authors EMBO reports 20: e47987 | 2019…”
Section: Next Generation Sequencing Data Filtering and Quality Assess… (mentioning)
confidence: 99%
“…Genome sequence data are strings of the four DNA nucleotide bases. Experimentally obtained, they are associated with a first range of metadata such as sequence quality, fragment length or methodology-oriented paired-ends sequencing (Bianchi et al, 2016). Subsequently, when sequences begin to be organized with the aim of specific biological understanding, we have long reads, contigs and also repeats, G+C content, tetranucleotide frequencies and a plethora of sequence descriptors [see (Weinel et al, 2002) as an example].…”
Section: Data Structures (mentioning)
confidence: 99%
“…For instance, in the genomic domain, for Affymetrix time course data obtained from Affymetrix GeneChips, one may use Affymetrix software (MAS 5.0) and probe set algorithms of MAS5 for background subtraction, signal intensity normalization between arrays, and non-specific hybridization correction etc [75-79]. To do so, high level performance hardware and software (e.g., programming languages and algorithms for visualizations) that conduct parallel and distributed and cloud computing to manage, retrieve, reformat and analyze the data from various resources including the genomic laboratory and hospital patient information systems needs be considered (Table 1) [58, 59, 80-84]. For instance, Bianchi et al [81] developed HTS-flow, a workflow management system that can retrieve information from a laboratory management system database, manages Omics data analyses through a simple GUI, outputs data in standard locations and allows the complete traceability of datasets, accompanying metadata and analysis scripts.…”
Section: Human Genomics/omics Application and Example (mentioning)
confidence: 99%
“…To do so, high level performance hardware and software (e.g., programming languages and algorithms for visualizations) that conduct parallel and distributed and cloud computing to manage, retrieve, reformat and analyze the data from various resources including the genomic laboratory and hospital patient information systems needs be considered (Table 1) [58, 59, 80-84]. For instance, Bianchi et al [81] developed HTS-flow, a workflow management system that can retrieve information from a laboratory management system database, manages Omics data analyses through a simple GUI, outputs data in standard locations and allows the complete traceability of datasets, accompanying metadata and analysis scripts. Childs et al [82] designed and implemented SoFIA, an Omics data integration framework for annotating high throughput data sets [82].…”
Section: Human Genomics/omics Application and Example (mentioning)
confidence: 99%