2019
DOI: 10.1101/840447
Preprint

WENGAN: Efficient and high quality hybrid de novo assembly of human genomes

Abstract: The continuous improvement of long-read sequencing technologies along with the development of ad hoc algorithms has launched a new de novo assembly era that promises high-quality genomes. However, it has proven difficult to use only long reads to generate accurate genome assemblies of large, repeat-rich human genomes. To date, most of the human genomes assembled from long error-prone reads add accurate short reads to further polish the consensus quality. Here, we report the development of a novel algorithm for …

Cited by 10 publications (13 citation statements)
References 48 publications
“…We evaluated the performance of HASLR on both simulated and real datasets. We selected five hybrid assemblers: hybridSPAdes ( Antipov et al., 2015 ), Unicycler ( Wick et al., 2017 ), DBG2OLC ( Ye et al., 2016 ), MaSuRCA ( Zimin et al., 2017 ), Wengan ( Di Genova et al., 2019 ); four long read methods: Canu ( Koren et al., 2017 ), Flye ( Kolmogorov et al., 2019 ), wtdbg2 ( Ruan and Li, 2019 ), miniasm ( Li, 2016 ); and two short read methods: Minia ( Chikhi and Rizk, 2013 ), SPAdes ( Bankevich et al., 2012 ). All experiments were performed on isolated nodes of a cluster (i.e., no other simultaneous jobs were allowed on each node).…”
Section: Results (mentioning)
confidence: 99%
“…(2016) , and Wang et al. (2018) for examples of such tools); (2) methods that first assemble raw LRs and then correct/polish the resulting draft assembly with SRs using polishing tools such as Pilon ( Walker et al., 2014 ) and Racon ( Vaser et al., 2017 ); and (3) methods that first assemble SRs and then utilize LRs to generate longer contigs (e.g., hybridSPAdes [ Antipov et al., 2015 ], Unicycler [ Wick et al., 2017 ], DBG2OLC [ Ye et al., 2016 ], and Wengan [ Di Genova et al., 2019 ]).…”
Section: Introduction (mentioning)
confidence: 99%
“…We evaluated the performance of HASLR on both simulated and real datasets. We selected five hybrid assemblers ( hybridSPAdes [1], Unicycler [32], DBG2OLC [34], Masurca [36] and Wengan [5]) as well as two non-hybrid methods (Canu [14] and wtdbg2 [25]). All experiments were performed on isolated nodes of a cluster (i.e.…”
Section: Results (mentioning)
confidence: 99%
“…PBcR [13] and Masurca [36]); (ii) methods that first assemble raw LRs and then correct/polish the resulting draft assembly with SRs using polishing tools such as Pilon [31] and Racon [29]; and (iii) methods that first assemble SRs and then utilize LRs to generate longer contigs (e.g. hybridSPAdes [1], Unicycler [32], DBG2OLC [34], and Wengan [5]).…”
Section: Introduction (mentioning)
confidence: 99%
“…This strategy was employed to produce a complete assembly of the human chromosome 8 by the T2T consortium [ 31 ]. Other strategies require no pedigree information for phasing and combine long reads with Hi-C [ 103 ] or single-cell strand sequencing data [ 104 ], or make use of several sequencing technologies [ 105 ]. Importantly, even if the genome size remains unaffected by the choice of an assembler or assembly parameters, the gene assembly can still be affected, especially when assembling highly heterozygous genomes [ 106 ].…”
Section: Long Reads (mentioning)
confidence: 99%