2023
DOI: 10.1101/2023.06.21.545871
Preprint

Co-linear Chaining on Pangenome Graphs

Abstract: Pangenome reference graphs are useful in genomics because they compactly represent the genetic diversity within a species, a capability that linear references lack. However, efficiently aligning sequences to these graphs with complex topology and cycles can be challenging. The seed-chain-extend based alignment algorithms use co-linear chaining as a standard technique to identify a good cluster of exact seed matches that can be combined to form an alignment. Recent works show how the co-linear chaining problem …
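
As an illustration of the chaining step the abstract mentions, here is a minimal sketch of co-linear chaining between two linear sequences, not the graph formulation of the preprint. It assumes each anchor is an exact match given as a hypothetical `(query_start, ref_start, length)` tuple with positive length, scores a chain by the total length of its anchors, and uses a simple quadratic dynamic program rather than the faster range-query approaches used in practice. The function name `chain_anchors` is made up for this example.

```python
from typing import List, Tuple

# Hypothetical anchor layout: (query_start, ref_start, length), length > 0.
Anchor = Tuple[int, int, int]


def chain_anchors(anchors: List[Anchor]) -> Tuple[int, List[Anchor]]:
    """Return (score, chain) for a best co-linear chain of anchors.

    Anchor a may precede anchor b only if a ends at or before the start
    of b on BOTH the query and the reference (no overlaps allowed).
    The score of a chain is the sum of its anchor lengths.
    """
    if not anchors:
        return 0, []

    # Sorting by query start guarantees every valid predecessor of an
    # anchor appears earlier in the list (lengths are positive).
    order = sorted(anchors)
    n = len(order)
    best = [a[2] for a in order]   # best chain score ending at each anchor
    prev = [-1] * n                # back-pointers for chain recovery

    for j in range(n):
        qj, rj, lj = order[j]
        for i in range(j):
            qi, ri, li = order[i]
            precedes = qi + li <= qj and ri + li <= rj
            if precedes and best[i] + lj > best[j]:
                best[j] = best[i] + lj
                prev[j] = i

    # Trace back from the highest-scoring chain end.
    end = max(range(n), key=best.__getitem__)
    score = best[end]
    chain: List[Anchor] = []
    while end != -1:
        chain.append(order[end])
        end = prev[end]
    chain.reverse()
    return score, chain


if __name__ == "__main__":
    demo = [(0, 0, 5), (6, 7, 4), (3, 20, 6), (12, 13, 3)]
    print(chain_anchors(demo))
```

In the demo input, the anchor `(3, 20, 6)` is left out of the chain because it overlaps its neighbours on the query and lies far off their diagonal on the reference.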

Citations: cited by 3 publications (10 citation statements)
References: 53 publications
“…Several versions of co-linear chaining problems have been studied for aligning two sequences [1,20,32,37,10,11,40]. Recent works have further studied the extension of chaining on acyclic [33,30,6,46] and cyclic pangenome graphs [3,43] but these formulations do not consider the haplotype paths. …”
Section: Proposed Algorithms and Complexity Analysis (mentioning)
confidence: 99%
“…These formulations do not consider the associations between genetic variants and may lead to alignments with spurious recombinations in variant-dense regions of the graph [42]. The existing formulations for co-linear chaining on graphs share the same limitation [6,33,46,30,43]. Chaining on DAGs can be solved in O(KN log KN) time, where K is the minimum number of paths covering all the vertices and N is the number of exact matches between the query and the DAG [6,30].…”
Section: Introduction (mentioning)
confidence: 99%
“…Some works determine a traversal distance between nodes on-the-fly by traversing local neighbourhoods around nodes [29,30]. More efficient strategies require a decomposition of the graph, typically into subgraphs [28,47] or a path/walk cover [31,32,33,46]. After chaining, anchor extension searches the graph forwards and backwards from the ends of each anchor to find high-scoring walks.…”
Section: Sequence-to-graph Alignment With Seed-extend-chain (mentioning)
confidence: 99%
“…We pre-process the graph and generate walks using the procedure described by [33]. To reduce the final representation size, we refrain from maintaining a matrix of traversal distances from each node to each stored walk.…”
Section: Simulating Assembly Graphs and Query Sequences (mentioning)
confidence: 99%