2015
DOI: 10.1038/srep16780

The power of single molecule real-time sequencing technology in the de novo assembly of a eukaryotic genome

Abstract: Second-generation sequencers (SGS) have been game-changing, achieving cost-effective whole genome sequencing in many non-model organisms. However, a large portion of the genomes still remains unassembled. We reconstructed azuki bean (Vigna angularis) genome using single molecule real-time (SMRT) sequencing technology and achieved the best contiguity and coverage among currently assembled legume crops. The SMRT-based assembly produced 100 times longer contigs with 100 times smaller amount of gaps compared to th…

Citations: cited by 71 publications (58 citation statements)
References: references 62 publications (80 reference statements)
“…The homozygous variants amounted to 1,532 SNPs, 20,479 deletions and 6,549 insertions showing that the assembly had 99.99% base accuracy. The insertion-deletion (in-del) errors had outnumbered the substitution errors, similar to the results observed in PacBio-based Vigna angularis 20 and Oropetium thomaeum 21 genome assemblies, and were replaced with the Illumina sequence bases. Mitochondrial and chloroplast derived sequences were identified to be 1.15 Mb from 51 contigs and were removed.…”
Section: Results
Citation type: supporting
confidence: 59%
“…However, a better resolution of such repeats was obtained in Oropetium thomaeum 21 genome assembly, possibly owing to the 15-kb lower end insert size selection, explaining the importance of longer read lengths in obtaining near-perfect assemblies. The potential of PacBio sequence data in long, eukaryotic genomes has been further showcased in the draft genomes of Gorilla gorilla 45 (scaffold N50 of 23.1 Mb), V. angularis 20 (scaffold N50 of 3.0 Mb), O. thomaeum 21 (contig N50 of 2.4 Mb) and Lates calcarifer 46 (scaffold N50 of 1.19 Mb). A rapid increase in PacBio sequencing for similar large-scale assemblies can be expected in the near future.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
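The N50 values quoted in the statement above are a standard contiguity summary. As a minimal sketch, assuming the usual definition (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length), it can be computed as follows. The function name `n50` and the toy lengths are illustrative assumptions, not taken from any of the cited papers.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Assumes the common definition: sort lengths from longest to shortest
    and return the length at which the running total first reaches half
    of the summed length.
    """
    if not lengths:
        raise ValueError("need at least one length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not real assembly data).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)
```

Because N50 depends only on the length distribution, scaffold N50 and contig N50 figures such as those listed above are not directly comparable with each other.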
“…However, the genome coverage rates of the assembly were low both in quinoa (73%) and P. veris (63%). Recently, near-complete genome assembly of a diploid species, Vigna angularis (azuki bean), was achieved using 27.6 Gbp of the PacBio long reads, which corresponds to 51× coverage of the genome size 68 . The authors constructed 2,529 scaffolds (N50 = 3.0 Mbp), covering 97.1% of the V. angularis genome.…”
Section: Results
Citation type: mentioning
confidence: 99%
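The coverage figure in the statement above follows from simple arithmetic: fold coverage is total sequenced bases divided by genome size. The sketch below, with hypothetical helper names, back-calculates the genome size implied by 27.6 Gbp at 51× and shows the inverse calculation for planning a target coverage. It assumes coverage is defined against the estimated genome size, as the statement says.

```python
def implied_genome_size(total_bases, fold_coverage):
    """Genome size (bp) implied by a read yield and a fold coverage."""
    if fold_coverage <= 0:
        raise ValueError("fold_coverage must be positive")
    return total_bases / fold_coverage


def required_bases(genome_size, fold_coverage):
    """Read yield (bp) needed to reach a target fold coverage."""
    return genome_size * fold_coverage


# Figures quoted in the citation statement: 27.6 Gbp of reads at 51x.
azuki_size = implied_genome_size(27.6e9, 51)   # about 541 Mbp
# Round trip: yield needed to reach 51x on a genome of that size.
azuki_yield = required_bases(azuki_size, 51)   # about 27.6 Gbp
```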
“…In our analysis of quinoa, we attained a genome coverage of 30×. Given the complexity of the quinoa genome resulting from ploidy and large genome size, a much greater coverage than 51× of the genome size is needed to use the method reported by Sakai et al. 68. Attaining a greater genome coverage is our next objective for refining our quinoa assembly.…”
Section: Results
Citation type: mentioning
confidence: 99%
“…It has been used for the sequencing of bacterial genomes such as the plant pathogen Xanthomonas oryzae [4]. PacBio reads have also been used for the sequencing of complex plant nuclear genomes, such as that of the Adzuki bean, Vigna angularis [5], demonstrating the advantage of this technology for resolving repetitive regions during sequence assembly.…”
Section: Introduction
Citation type: mentioning
confidence: 99%