2019
DOI: 10.1371/journal.pone.0216885

gapFinisher: A reliable gap filling pipeline for SSPACE-LongRead scaffolder output

Abstract: Unknown sequences, or gaps, are present in many published genomes across public databases. Gap filling is an important finishing step in de novo genome assembly, especially in large genomes. The gap filling problem is nontrivial and while there are many computational tools partially solving the problem, several have shortcomings as to the reliability and correctness of the output, i.e. the gap filled draft genome. SSPACE-LongRead is a scaffolding tool that utilizes long reads from multiple third-generation seq…

Cited by 16 publications (11 citation statements)
References 32 publications
“…Haplotigs and either extremely high or extremely low coverage contigs were removed (using purge haplotigs (a-70)) (Roach, Schmidt, and Borneman 2018). Remaining contigs were upgraded using FinisherSC (Lam et al 2015) and scaffolded using iterative rounds of SSPACE_longreads (Boetzer and Pirovano 2014) and gapFinisher (Kammonen et al 2019): Round 1 (error corrected reads, -k 1 -o 1000 -l 10 -g 500); round 2 (error corrected reads, -k 1 -o 1000 -l 5); and finally round 3 (raw reads, -k 1 -o 1000 -l 30). Scaffolds were then polished with three rounds of arrow (raw PacBio reads, smrttools-release_6.0.0.47835, in --diploid mode), and five rounds of Pilon (raw PacBio reads mapped with minimap2, 150 bp Illumina read pairs mapped with BWA-mem).…”
Section: Flow Cytometry (mentioning)
confidence: 99%
“…For S. scabiei, all Illumina short reads were assembled with SPAdes v3.13.1 [85] into the first draft genome. Then, scaffolding was performed by SSPACE Basic v2.0 [75] with paired-end Illumina short reads, followed by gap filling using GapFiller v1.10 [86] and GapFinisher v1.1 [87]. Finally, sequence polishing was completed by Pilon v1.22 [79] with all reads and the final genome assembly was generated.…”
Section: Genome and Transcriptome Assembly (mentioning)
confidence: 99%
“…Then, the 10x reads were incorporated by short-read polishing using Pilon (Pilon, RRID:SCR_014731) v1.23 [51] with reads mapped using Minimap2 v2.12 [41] and correcting for indels only; we found correcting for indels only resulted in a higher BUSCO score than correcting for indels and SNPs following the steps described in this section. We scaffolded using SSPACE-LongRead v1.1 [52] followed by gap-filling using gapFinisher v20190917 [53]. The assembly was scaffolded from 209 contigs into 138 scaffolds, however, no gaps were filled.…”
Section: Assembly Polishing and Gap-filling (mentioning)
confidence: 99%