2020
DOI: 10.1534/g3.119.400908

BlobToolKit – Interactive Quality Assessment of Genome Assemblies

Abstract: Reconstruction of target genomes from sequence data produced by instruments that are agnostic as to the species-of-origin may be confounded by contaminant DNA. Whether introduced during sample processing or through co-extraction alongside the target DNA, if insufficient care is taken during the assembly process, the final assembled genome may be a mixture of data from several species. Such assemblies can confound sequence-based biological inference and, when deposited in public databases, may be included in do…

citations
Cited by 1,161 publications
(342 citation statements)
references
References 47 publications
“…B. BlobToolKit (Challis et al, 2020) plot (GC proportion [x axis] versus read coverage [y axis]) of the flye assembly of all the sequenced reads. Taxonomic annotation was achieved using blastx and Diamond tblastn source classification.…”
Section: Figure S1 (mentioning)
confidence: 99%
“…This includes identifying remaining vector and adapter contamination based on known sequence. Contaminating sequence can be detected with dedicated toolkits, such as BlobToolKit [18] or Anvi'o [19] or through individual sequence similarity searches using BLAST or Diamond against suitable databases (Table 1). Our in-house pipelines use automated detection of synthetic, laboratory and natural contaminants, but include manual controls to preserve sequences that may be the product of horizontal gene transfer (described below).…”
Section: Checking For Assembly Coherence Coverage and Contamination (mentioning)
confidence: 99%
“…Finally, we used ncbi-blastn to query the contigs against the NCBI nr nucleotide database (-max_target_seqs 10 -max_hsps 1 -evalue 1e-25). We used the BlobToolKit pipeline (Challis, Richards, Rajan, Cochrane & Blaxter, 2020) to merge and plot the DIAMOND and BLAST taxonomic assignment with the assembly statistics (GC content and base coverage). Contigs explicitly assigned to anything else than metazoan were excluded from downstream analysis.…”
Section: Contamination Control (mentioning)
confidence: 99%
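
The filtering step described in the statement above (dropping contigs explicitly assigned to a non-metazoan taxon) can be illustrated with a minimal Python sketch. This is not the citing authors' code or any BlobToolKit API: the function name `keep_contigs`, the `"no-hit"` label for unassigned contigs, and the example table are all hypothetical, and the sketch assumes that unassigned contigs are retained, since only explicit non-metazoan assignments are said to be excluded.

```python
from typing import Dict, List


def keep_contigs(
    assignments: Dict[str, str],
    target: str = "Metazoa",
    unassigned: str = "no-hit",
) -> List[str]:
    """Return IDs of contigs assigned to `target` or left unassigned.

    `assignments` is a hypothetical mapping of contig ID to the taxon
    label produced by an upstream similarity search; contigs explicitly
    assigned to any other taxon are dropped.
    """
    return [
        contig_id
        for contig_id, taxon in assignments.items()
        if taxon in (target, unassigned)
    ]


# Illustrative data only, not results from any cited study.
example = {
    "contig_1": "Metazoa",
    "contig_2": "Bacteria",
    "contig_3": "no-hit",
    "contig_4": "Fungi",
}
kept = keep_contigs(example)
print(kept)
```

Whether unassigned contigs should be kept or discarded is a design choice that depends on the study; the sketch keeps them to match the wording "explicitly assigned".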