2019
DOI: 10.1101/744755
Preprint

SPAligner: Alignment of Long Diverged Molecular Sequences to Assembly Graphs

Abstract: Background: Graph-based representation of genome assemblies has recently been used in a range of applications, from gene finding to haplotype separation. While most of these applications are based on the alignment of molecular sequences to assembly graphs, existing software tools for finding such alignments have important limitations. Results: We present a novel SPAligner tool for aligning long diverged molecular sequences to assembly graphs and demonstrate that SPAligner is an efficient solution for mapping th…

Cited by 3 publications
(4 citation statements)
References 38 publications
“…All practical long read to DAG aligners that scale to large genomes rely on seed–filter–extend methodology [8,22,25,27,34]. The first step is to find a set of anchors which indicate short exact matches, e.g., k-mer or minimizer matches, between substrings of a sequence to subpaths in a DAG.…”
Section: Introduction
confidence: 99%
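
To make the anchor-finding step in the statement above concrete, here is a minimal Python sketch of exact k-mer seeding against a labeled graph. It is an illustration only, not the method of SPAligner or of the citing paper. The function names and the toy graph are hypothetical, and the sketch makes a simplifying assumption: k-mers are indexed only inside single node labels, so matches that span an edge between two nodes are not found.

```python
from collections import defaultdict

def index_node_kmers(node_labels, k):
    """Map each k-mer to the (node_id, offset) positions where it occurs.

    Simplifying assumption: only k-mers fully contained in one node label
    are indexed; k-mers spanning an edge are ignored.
    """
    index = defaultdict(list)
    for node_id, label in node_labels.items():
        for offset in range(len(label) - k + 1):
            index[label[offset:offset + k]].append((node_id, offset))
    return index

def find_anchors(read, index, k):
    """Return anchors (read_pos, node_id, node_offset) for exact k-mer matches."""
    anchors = []
    for read_pos in range(len(read) - k + 1):
        kmer = read[read_pos:read_pos + k]
        # .get avoids creating empty entries in the defaultdict
        for node_id, offset in index.get(kmer, ()):
            anchors.append((read_pos, node_id, offset))
    return anchors

# Hypothetical toy graph: node id -> sequence label
node_labels = {"n1": "ACGTAC", "n2": "GGA", "n3": "TTACG"}
index = index_node_kmers(node_labels, 3)
anchors = find_anchors("TACGG", index, 3)
```

In a seed-filter-extend pipeline, anchors like these would then be filtered (for example by chaining) and extended into full alignments; those later steps are not shown here.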
“…In particular, their ability to more scalably represent diverse collections of sequences partially addresses the reference bias issues common in current standard analysis protocols [10]. To allow for sensitive sequence search in this new domain, many tools generalize pairwise sequence-to-sequence alignment (i.e., computing the minimum edit distance or the maximum similarity score between a query and a target sequence) algorithms to align queries against sequence graphs [1,2,54,33,29,16,37,51,52,16]. These sequence-to-graph alignment tools perform a search on the graph to align a query against the spellings of one or more walks.…”
Section: Introduction
confidence: 99%
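
The generalization of minimum edit distance from a target sequence to a sequence graph, mentioned in the statement above, can be sketched with a small dynamic program. This is a textbook-style illustration under stated assumptions, not the algorithm of any cited tool: the graph is acyclic, each node carries a single character, nodes are listed in topological order, and the walk may start and end at any node. The function name and the toy graph are hypothetical.

```python
def min_edit_distance_to_dag(query, node_chars, preds):
    """Minimum edit distance between `query` and the spelling of any walk.

    Assumptions (for illustration): node_chars[v] is the single character
    of node v, nodes are in topological order, and preds[v] lists the
    predecessors of v (all with index < v). The walk may start and end
    anywhere, and may be empty.
    """
    m = len(query)
    # Virtual start row: query[:j] is inserted before the walk begins.
    start_row = list(range(m + 1))
    rows = []
    best = m  # cost of aligning the whole query to an empty walk
    for v, ch in enumerate(node_chars):
        prev_rows = [start_row] + [rows[p] for p in preds[v]]
        # Empty query prefix: the character of v is left unaligned.
        row = [min(p[0] for p in prev_rows) + 1]
        for j in range(1, m + 1):
            substitute = min(p[j - 1] for p in prev_rows) + (query[j - 1] != ch)
            skip_node = min(p[j] for p in prev_rows) + 1   # graph char unaligned
            insert = row[j - 1] + 1                        # query char unaligned
            row.append(min(substitute, skip_node, insert))
        rows.append(row)
        best = min(best, row[m])
    return best

# Hypothetical toy graph with one bubble: A -> C -> T and A -> G -> T
node_chars = ["A", "C", "G", "T"]
preds = [[], [0], [0], [1, 2]]
exact = min_edit_distance_to_dag("AGT", node_chars, preds)
one_off = min_edit_distance_to_dag("AAT", node_chars, preds)
```

Here "AGT" is spelled by the walk through the G branch, while "AAT" is one substitution away from either branch. Tools that maximize a similarity score instead of minimizing a distance use the same recurrence shape with the scoring reversed.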
“…These sequence-to-graph alignment tools perform a search on the graph to align a query against the spellings of one or more walks. Depending on their choice of scoring model, these tools can be categorized into those that use the graph as a compact representation of disjoint reference sequences for sequence-to-sequence alignment [1,2,33], those that align to species pangenomes [20,54,50,41], or those that treat the entire graph as a single sample [30,24,40,16,29,51,52,16]. Throughout this work, we use the convention that alignment scores measure sequence similarity, and hence, higher values are better.…”
Section: Introduction
confidence: 99%
“…The two basic strategies for assembly are reference based methods that align reads to a reference genome [11–16] and de-novo methods that use the overlaps among the reads themselves for assembly, without the need for a reference sequence [17–22]. Tools that align reads to genome graphs, such as, GraphAligner [23] and SPAligner [24], are emerging and can also be used for producing assemblies. In this manuscript, we present two tools: SAUTE (Sequence Assembly Using Target Enrichment) and SAUTE_PROT.…”
Section: Introduction
confidence: 99%