2019
DOI: 10.1093/bioinformatics/btz219

Graph analysis of fragmented long-read bacterial genome assemblies

Abstract: Motivation: Long-read genome assembly tools are expected to reconstruct bacterial genomes nearly perfectly; however, they still produce fragmented assemblies in some cases. It would be beneficial to understand whether these cases are intrinsically impossible to resolve, or if assemblers are at fault, implying that genomes could be refined or even finished with little to no additional experimental cost. Results: We propose a set…

Cited by 6 publications (6 citation statements)
References 35 publications

“…For HybriSPAdes, an optimized k-mer size has to be determined to close gaps, depending on the read length and depth. We tested six different K-mer values (33, 55, 77, 99, 111 and 127) and used Bandage (https://github.com/rrwick/Bandage/wiki/Effect-of-kmer-size) to identify the best k-mer value (Marijon et al, 2019). HybridSPAdes was ran on R1 and R2 paired-end reads corrected with the short read error correction step (BayesHammer; Nikolenko et al, 2013), the accurate mode on, and the Mismatch correction step.…”
Section: Methods (mentioning)
confidence: 99%
“…For HybriSPAdes, an optimized k-mer size has to be determined to close gaps, depending on the read length and depth. We tested six different K-mer values (33, 55, 77, 99, 111 and 127) and used Bandage (https://github.com/rrwick/Bandage/wiki/Effect-of-kmer-size) to identify the best k-mer value (Marijon et al, 2019). We choose the k-mer size producing the assembly graph with the lowest complexity (lowest number of nodes and edges and highest nodes N50), while checking that the graph was not broken into too much disconnected parts as suggested on the Bandage user’s manual (https://github.com/rrwick/Bandage/wiki/Effect-of-kmer-size).…”
Section: Methods (mentioning)
confidence: 99%
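
The selection rule quoted above (fewest nodes and edges, highest node N50, not too many disconnected parts) can be sketched as follows. This is only an illustration under stated assumptions, not the citing authors' procedure: the names `GraphStats`, `graph_stats` and `pick_k`, the `max_components` cutoff, and the way the criteria are combined into one ranking are all hypothetical. The graph is assumed to be already parsed into segment lengths and links (for example from the S and L records of a GFA file).

```python
from dataclasses import dataclass


@dataclass
class GraphStats:
    nodes: int
    edges: int
    n50: int
    components: int


def n50(lengths):
    """Largest length L such that segments of length >= L cover half the total."""
    lengths = list(lengths)
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty graph


def graph_stats(segments, links):
    """segments: {name: length}; links: list of (name_a, name_b) pairs."""
    parent = {name: name for name in segments}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)

    components = len({find(name) for name in segments})
    return GraphStats(len(segments), len(links), n50(segments.values()), components)


def pick_k(stats_by_k, max_components=3):
    """Hypothetical ranking: drop graphs with too many disconnected parts,
    then prefer the smallest nodes + edges, breaking ties by higher N50."""
    candidates = {k: s for k, s in stats_by_k.items() if s.components <= max_components}
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda k: (candidates[k].nodes + candidates[k].edges, -candidates[k].n50),
    )


# Toy graph: three segments, one link, hence two connected components.
toy = graph_stats({"a": 100, "b": 50, "c": 10}, [("a", "b")])
```

In practice the cutoff on disconnected parts and the weighting between graph size and N50 would need to be chosen per dataset; the quoted passage describes visual inspection in Bandage rather than a fixed formula.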
“…They demonstrate that the method is able to error correct both long and ultra-long sequence reads and is highly scalable, as it is the only method that is able to scale to a human dataset containing ultra-long reads. Marijon et al. (2019) describe a method to analyze assembly graphs produced from long reads to recover contigs that were lost during the assembly process.…”
Section: Main Text (mentioning)
confidence: 99%
“…Although long reads greatly improved the contiguity of genome assemblies, resolving long repeats remains a challenging task. For example, the state-of-the-art long-read assemblers fail to fully assemble ~50% of bacterial genomes from the NCTC 3000 project aimed at sequencing 3000 bacterial genomes from England's National Collection of Type Cultures (Kamath et al, 2017; Marijon et al, 2019). Additionally, the base-pair accuracy of the long-read assemblies in the repeated regions is reduced (as compared to unique regions) since it is often unclear how to align reads to various repeat copies even if the repeat itself was bridged by some but not all reads.…”
Section: Introduction (mentioning)
confidence: 99%