2020
DOI: 10.1101/2020.05.05.079327
Preprint

Optimizing experimental design for genome sequencing and assembly with Oxford Nanopore Technologies

Abstract: High quality reference genome sequences are the core of modern genomics. Oxford Nanopore Technologies (ONT) produces inexpensive DNA sequences in excess of 100,000 nucleotides, but error rates remain >10% and assembling these sequences, particularly for eukaryotes, is a non-trivial problem. To date there has been no comprehensive attempt to generate experimental design for ONT genome sequencing and assembly. Here, we simulate ONT and Illumina DNA sequence reads for Escherichia coli, Caenorhab…

Citations: cited by 1 publication (6 citation statements)
References: 58 publications
“…We also measured BUSCO completeness with the Diptera_odb10 for the D. melanogaster assemblies, but found that, in some instances, our assembled sequence contained a greater proportion of conserved genes than the reference sequence. We chose to focus on the Metazoan_odb10 for D. melanogaster and present both sets of statistics in the additional files in the Gigascience database [35].…”
Section: Discussion (mentioning)
confidence: 99%
“…The sequence read N50 indicates that 50% of the total sequenced nucleotides are in reads of that length or longer. Libraries and descriptive statistics including mean read lengths and qualities are available in the Gigascience database [35]. Using simulated libraries allowed us to study depth beyond the capability of single ONT flowcells, and to avoid possible errors created by combining libraries generated by different labs under different conditions.…”
Section: Simulated Sequence Libraries (mentioning)
confidence: 99%
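
The read N50 definition quoted above can be illustrated with a short calculation. This is a minimal sketch, not the authors' code: the function name `read_n50` and the example lengths are hypothetical, and it assumes read lengths are given as a plain list of positive integers.

```python
def read_n50(lengths):
    """Return the read N50: the largest length L such that reads of
    length >= L contain at least 50% of all sequenced nucleotides.

    `lengths` is a hypothetical input: a list of read lengths in nucleotides.
    """
    if not lengths:
        raise ValueError("at least one read length is required")

    total = sum(lengths)
    running = 0
    # Walk from the longest read down, accumulating nucleotides until
    # half of the total has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical example: five reads totalling 20 nucleotides.
example_lengths = [2, 3, 4, 5, 6]
n50 = read_n50(example_lengths)
```

In this made-up example the two longest reads (6 and 5) hold 11 of the 20 nucleotides, so the N50 is 5.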