2023
DOI: 10.26508/lsa.202201719

New algorithms for accurate and efficient de novo genome assembly from long DNA sequencing reads

Abstract: Building de novo genome assemblies for complex genomes is possible thanks to long-read DNA sequencing technologies. However, maximizing the quality of assemblies based on long reads is a challenging task that requires the development of specialized data analysis techniques. We present new algorithms for assembling long DNA sequencing reads from haploid and diploid organisms. The assembly algorithm builds an undirected graph with two vertices for each read based on minimizers selected by a hash function derived…
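To make the minimizer idea in the abstract concrete, here is a minimal Python sketch of generic (w, k)-minimizer selection: in every window of w consecutive k-mers, the k-mer with the smallest hash value is kept. This is not the paper's implementation. The abstract is truncated before it says what the hash function is derived from, so `kmer_hash` below is a stand-in assumption (a BLAKE2b digest), and the names `kmer_hash` and `minimizers` are hypothetical.

```python
import hashlib


def kmer_hash(kmer):
    """Stand-in hash (an assumption, not the paper's hash function)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minimizers(seq, k, w, hash_fn=kmer_hash):
    """Return (position, k-mer) pairs chosen as (w, k)-minimizers of seq.

    For each window of w consecutive k-mers, the k-mer with the smallest
    hash is selected; ties are broken by the leftmost position.
    """
    n = len(seq) - k + 1  # number of k-mers in seq
    if n <= 0:
        return []
    hashes = [hash_fn(seq[i:i + k]) for i in range(n)]
    selected = set()
    # If seq holds fewer than w k-mers, treat the whole read as one window.
    for start in range(max(1, n - w + 1)):
        window = range(start, min(start + w, n))
        selected.add(min(window, key=lambda i: (hashes[i], i)))
    return [(i, seq[i:i + k]) for i in sorted(selected)]


if __name__ == "__main__":
    # Toy values; real runs use much longer reads and larger k and w.
    for pos, kmer in minimizers("ACGTTGCATGCCGATAGCTAGGCTA", k=5, w=4):
        print(pos, kmer)
```

Two reads that share a long enough substring select the same minimizers inside it, which is what lets an assembler find candidate overlaps without comparing every pair of reads base by base. One of the citing papers quoted on this page reports running the NGSEP Assembler with a k-mer length of 25 and a window length of 40, which would correspond to `k=25, w=40` in this sketch.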

Citations: cited by 3 publications (2 citation statements)
References: 41 publications (78 reference statements)
“…The reads were error-corrected using NECAT v0.0.1 correction step ( 29 ) with default parameters. Corrected reads were assembled, and the resulting assemblies were circularized using NGSEP Assembler v4.3.1 with default parameters ( 30 ). The genes used for circularization were rpoB for chromosome I and oriC for chromosome II.…”
Section: Methods (mentioning)
confidence: 99%
“…Hifiasm v0.12(r304) 80 was executed with the parameter “-n-hap” equal to 2 to assemble a diploid genome. The Assembler command of the Next Generation Sequencing Experience Platform (NGSEP) v4.3.1 was also executed using as parameters a k-mer length (-k) of 25, window length (-w) 40 and ploidy (-ploidy) of 2 81 . Contigs were aligned to the publicly available haploid genome assembly of the TcI Brazil A4 strain using Minimap2 v2.22 82 .…”
Section: Methods (mentioning)
confidence: 99%