2019
DOI: 10.3389/fmicb.2019.01560

Using QC-Blind for Quality Control and Contamination Screening of Bacteria DNA Sequencing Data Without Reference Genome

Abstract: Quality control for next generation sequencing (NGS) has become increasingly important with the ever increasing importance of sequencing data for omics studies. Tools have been developed for filtering possible contaminants from species with known reference genome. Unfortunately, reference genomes for all the species involved, including the contaminants, are required for these tools to work. This precludes many real-life samples that have no information about the complete genome of the target species, and are c…

Cited by 10 publications (7 citation statements)
References 44 publications
“…Weltevreden strains. Potential interspecies and cross-species contamination in the whole genome sequence data was checked with ConFindr ( 44 ) and QC-Blind ( 45 ). Raw reads with low quality base pairs at each terminal (Quality-Value < 20), and/or those with a short length (parameter setting at 50 bp), or > 15 bp overlap with Illumina TruSeq adapter sequences (parameter setting at 15 bp) were removed.…”
Section: Methods; citation type: mentioning
confidence: 99%
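
The read-filtering rule quoted in this statement can be sketched in Python. This is only an illustration under stated assumptions, not the citing authors' pipeline: the function names are hypothetical, qualities are assumed to be Phred+33 encoded, "low quality base pairs at each terminal" is read as "the first or last base is below Q20", and "> 15 bp overlap" is read as an exact match of at least 16 consecutive adapter bases. The adapter in the usage lines is a made-up placeholder, not the real Illumina TruSeq sequence.

```python
from typing import List


def phred33_scores(quality_string: str) -> List[int]:
    """Decode a FASTQ quality string, assuming Phred+33 encoding."""
    return [ord(ch) - 33 for ch in quality_string]


def has_adapter_overlap(read: str, adapter: str, max_overlap: int = 15) -> bool:
    """Return True if the read shares more than max_overlap consecutive
    bases with the adapter (exact matching only, an assumption)."""
    k = max_overlap + 1
    if len(read) < k or len(adapter) < k:
        return False
    # Check every adapter substring of length k against the read.
    return any(adapter[i:i + k] in read for i in range(len(adapter) - k + 1))


def keep_read(
    sequence: str,
    quality_string: str,
    adapter: str,
    min_quality: int = 20,
    min_length: int = 50,
    max_overlap: int = 15,
) -> bool:
    """Apply the three quoted criteria; return True if the read is kept."""
    if not sequence or len(sequence) < min_length:
        return False  # too short
    scores = phred33_scores(quality_string)
    if scores[0] < min_quality or scores[-1] < min_quality:
        return False  # low-quality terminal base (assumed interpretation)
    if has_adapter_overlap(sequence, adapter, max_overlap):
        return False  # more than max_overlap bp of adapter sequence
    return True


# Usage with a placeholder adapter (hypothetical sequence).
PLACEHOLDER_ADAPTER = "ACGTTGCAACGTTGCAAGGT"
good = keep_read("A" * 60, "I" * 60, PLACEHOLDER_ADAPTER)
short = keep_read("A" * 40, "I" * 40, PLACEHOLDER_ADAPTER)
```

In practice a dedicated trimming tool would normally handle mismatches and partial adapters; the exact-match check here is chosen only to keep the sketch short.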
“…Genomic and metagenomic sequencing data commonly contain possible contamination from various environments, yet identification and removal of these contaminants remain difficult [126] , [127] , [128] . Machine learning-enabled source tracking and sequence clustering methods can act together to identify and remove contaminants, regardless of known or unknown sources [128] . Indeed, the application of machine learning methods to sequencing data can lead to the removal of most known contaminants [129] .…”
Section: Applications In Microbial Dark Matter Analysis; citation type: mentioning
confidence: 99%
“…At present, the robust taxonomic affiliation of microorganisms is based on the extraction, sequencing, and analysis of their genome, partial or total, supported by morphological, physiological, and metabolic traits (Figure 2). For this, sequencing high-quality DNA of the microorganism of interest and validating the obtained data through a quality control process are decisive factors, because reads of DNA without the required quality can be excluded in subsequent analysis (Xi et al, 2019). FastQC is a digital tool designed to perform numerous quality control checks of the sequences that are obtained (Andrews, 2010).…”
Section: Fully Bilingual; citation type: mentioning
confidence: 99%
“…At present, the robust taxonomic affiliation of microorganisms is based on the extraction, sequencing, and analysis of their genome, partial or total, supported by morphological, physiological, and metabolic traits (Figure 2). For this, sequencing high-quality DNA from the microorganism of interest and validating the resulting data through a quality control process are decisive, which allows DNA reads lacking the necessary quality to be excluded from subsequent analyses (Xi et al, 2019). FastQC is a digital tool that performs numerous quality control checks on the sequences obtained (Andrews, 2010).…” (translated from the Spanish)
Section: Estudios Genómicos Para La Afiliación Taxonómica De Acb De O; citation type: unclassified