2016
DOI: 10.1038/sdata.2016.81

Next generation sequencing data of a defined microbial mock community

Abstract: Generating sequence data of a defined community composed of organisms with complete reference genomes is indispensable for the benchmarking of new genome sequence analysis methods, including assembly and binning tools. Moreover, the validation of new sequencing library protocols and platforms to assess critical components such as sequencing errors and biases relies on such datasets. We here report the next generation metagenomic sequence data of a defined mock community (Mock Bacteria ARchaea Community; MBARC-2…

Cited by 97 publications (122 citation statements)
References 28 publications
“…Primer design for universal amplification of the V4-V5 region of 16S rDNA was based on Parada et al (2016). Resulting sequences were demultiplexed, and contaminating Illumina adaptor sequences were removed using the k-mer filter in BBDuk (v37.62) following Singer et al (2016). Briefly, BBDuk was used to remove reads containing more than 1 N base or with a quality score < 10 across the read or a length ≤ 51 bp or 33 % of the full length read.…”
Section: 16S rRNA Metagenomic Analysis
mentioning
confidence: 99%
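The filtering criteria quoted above can be sketched as a simple per-read check. This is a minimal illustration, not BBDuk's implementation or its command-line interface: the function name `keep_read` and its parameters are hypothetical, "quality score < 10 across the read" is assumed to mean the mean Phred quality, and "33 % of the full length read" is assumed to be measured against the read's length before trimming.

```python
from statistics import mean


def keep_read(seq, quals, full_length,
              max_n=1, min_mean_q=10, min_len=51, min_frac=0.33):
    """Return True if a (possibly trimmed) read passes all four checks.

    seq         -- read bases as a string
    quals       -- per-base Phred quality scores (list of ints)
    full_length -- read length before trimming (assumed reference length)
    """
    # Discard reads containing more than `max_n` ambiguous (N) bases.
    if seq.upper().count("N") > max_n:
        return False
    # Discard reads whose mean quality is below the threshold
    # (assumed interpretation of "quality score < 10 across the read").
    if not quals or mean(quals) < min_mean_q:
        return False
    # Discard reads of length <= 51 bp.
    if len(seq) <= min_len:
        return False
    # Discard reads at or below 33 % of the original read length.
    if len(seq) <= min_frac * full_length:
        return False
    return True
```

For example, an 80 bp read with uniform quality 30 and an original length of 100 bp passes, while a 60 bp read trimmed from a 250 bp original fails the 33 % check.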
“…Briefly, BBDuk was used to remove reads containing more than 1 N base or with a quality score < 10 across the read or a length ≤ 51 bp or 33 % of the full length read. Additional processing using BBMap (http://bbtools.jgi.doe.gov, last access: 19 January 2018) mapped reads to masked human, cat, dog, and mouse references, discarding hits exceeding 93 % identity (Singer et al, 2016).…”
Section: 16S rRNA Metagenomic Analysis
mentioning
confidence: 99%
“…To validate the DAS Tool algorithm, we applied it to data from a synthetic microbial community that was constructed by mixing together DNA of 22 bacteria (including different species from the same genus) and 3 archaea [14]. We predicted bins using five binning tools (ABAWACA 1.07 (https://github.com/CK7/abawaca), CONCOCT [9], MaxBin 2 [11], MetaBAT [10] and tetranucleotide ESOMs [4]) and combined the result using DAS Tool.…”
Section: DAS Tool Applied to a Synthetic Community Comprised of a Mix
mentioning
confidence: 99%
“…This is the theoretical limit for overlap-based clustering algorithms. The second data set is two million Illumina short metagenome reads (150 bp) sampled from a mock microbial community consisting of 26 genomes described previously [34]. Clusters are defined similarly as above for the PacBio transcriptome data set.…”
mentioning
confidence: 99%
“…Since current NGS technologies are not able to read the entire sequence of a genome at once, genomes are broken into small DNA/RNA fragments followed by massive parallel high-throughput sequencing. Different technologies produce sequence reads that vary in length.…”
mentioning
confidence: 99%