2017
DOI: 10.1101/gr.216465.116

HINGE: long-read assembly achieves optimal repeat resolution

Abstract: Long-read sequencing technologies have the potential to produce gold-standard de novo genome assemblies, but fully exploiting error-prone reads to resolve repeats remains a challenge. Aggressive approaches to repeat resolution often produce misassemblies, and conservative approaches lead to unnecessary fragmentation. We present HINGE, an assembler that seeks to achieve optimal repeat resolution by distinguishing repeats that can be resolved given the data from those that cannot. This is accomplished by adding …

Cited by 93 publications (84 citation statements)
References 28 publications
“…Eight draft assemblies were generated, six of which were produced with CANU v1.3 (Berlin et al, 2015; Koren et al, 2017), one with FALCON v0.7.3 (Chin et al, 2016) and one with ABRUIJN v0.4. HINGE v0.41 (Kamath et al, 2017) was also tested on this dataset, but at that time the tool required the entire alignment file (over 2 Tb) to fit in primary memory and we did not have the computational resources to handle it. CANU v1.3 was run with different settings for the error correction stage on the entire dataset of ~6 M reads (two CANU runs were optimized for highly repetitive genomes).…”
Section: Pacific Biosciences Sequencing Pacific Biosciences Reads
mentioning
confidence: 99%
“…However, the massive read lengths and increased error rate of these new technologies have also required updated assembly methods. This issue includes three new assembly tools designed specifically for long-read PacBio and Nanopore data: Canu, HINGE (Kamath et al 2017), and Racon (Vaser et al 2017).…”
mentioning
confidence: 99%
“…In contrast to second-generation technologies, producing reads reaching lengths of a few hundreds base pairs, they allow the sequencing of much longer reads (10 kbp on average (Sedlazeck et al, 2018b), and up to >1 million bps (Jain et al, 2018)). These long reads are expected to solve various problems, such as contig and haplotype assembly (Patterson et al, 2015;Kamath et al, 2017), scaffolding (Cao et al, 2017), and structural variant calling (Sedlazeck et al, 2018a). However, they are very noisy.…”
Section: Introduction
mentioning
confidence: 99%