2022
DOI: 10.1007/s11047-022-09882-6

Computational graph pangenomics: a tutorial on data structures and their applications

Abstract: Computational pangenomics is an emerging research field that is changing the way computer scientists are facing challenges in biological sequence analysis. In past decades, contributions from combinatorics, stringology, graph theory and data structures were essential in the development of a plethora of software tools for the analysis of the human genome. These tools allowed computational biologists to approach ambitious projects at population scale, such as the 1000 Genomes Project. A major contribution of the…

Cited by 28 publications (30 citation statements)
References 97 publications
“…We denote a matrix of $h$ rows and $w$ columns as $M[1..h][1..w]$. We let $\mathrm{col}(M)_j$ denote the $j$-th column of $M$, i.e., the string drawn as $\mathrm{col}(M)_j = M[1..h][j] = M[1][j]\,M[2][j] \cdots M[h][j]$.…”
Section: Definitions; citation type: mentioning
confidence: 99%
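
To make the column notation in the statement above concrete, here is a minimal Python sketch. The function name `col` and the list-of-rows representation of the matrix are assumptions made for illustration; they do not come from the citing paper.

```python
def col(matrix, j):
    """Return col(M)_j as a string.

    Assumptions (illustrative only): `matrix` is a list of h equal-length
    rows, and `j` is 1-based, matching the M[1..h][1..w] notation.
    """
    # Read column j from top to bottom: M[1][j] M[2][j] ... M[h][j].
    return "".join(str(row[j - 1]) for row in matrix)


# Hypothetical 4 x 3 binary matrix used only as an example.
M = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]
first_column = col(M, 1)
```

With this example matrix, `col(M, 1)` concatenates the first entry of each of the four rows.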
“…In the framework of computational pangenomics, the positional BWT (PBWT), which is a method of permuting the elements of each column of a $h \times w$ binary matrix $M[1..h][1..w]$, is a key instrument in the compact representation of large haplotypes data sets [9]. Indeed, due to the intrinsic capability of the PBWT of saving space in memorizing haplotype data and even in analyzing large haplotypes panels, it is becoming a relevant data structure for pangenomics (see the recent tutorial on data structures [2]). It is used in relevant computational steps related to haplotype phasing and analysis, such as the matching procedure in reference panels of haplotypes.…”
Section: Introduction; citation type: mentioning
confidence: 99%
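
The statement above describes the PBWT as a permutation of each column of a binary matrix. The sketch below shows one common way to compute it, assuming the standard construction in which, before each column is read, the rows are ordered by their reversed prefixes. The function name `pbwt` and the list-of-rows input are illustrative assumptions, not the citing paper's code.

```python
def pbwt(matrix):
    """Return (columns, order) for a binary matrix given as a list of rows.

    columns[j] is column j of the matrix read in the row order induced by
    sorting rows on their reversed prefixes of length j (0-based j).
    order is the row order after the last column has been processed.
    """
    h = len(matrix)
    w = len(matrix[0]) if h else 0
    order = list(range(h))  # before any column is seen: original row order
    columns = []
    for j in range(w):
        # Permuted column j: read column j in the current row order.
        columns.append([matrix[i][j] for i in order])
        # Stable partition on the bit in column j: rows holding 0 keep their
        # relative order and come first, followed by rows holding 1.
        zeros = [i for i in order if matrix[i][j] == 0]
        ones = [i for i in order if matrix[i][j] == 1]
        order = zeros + ones
    return columns, order


# Hypothetical 4 x 3 haplotype panel used only as an example.
panel = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]
columns, final_order = pbwt(panel)
```

Each entry of `columns` holds the same bits as the corresponding input column, only reordered, which is what allows equal bits to cluster into runs that compress well.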
“…The latest version of industry-standard DRAGEN software by Illumina now uses a pangenome graph for mapping reads in highly polymorphic regions of a human genome [13]. For surveys of the recent algorithmic developments in this area, see [2,6,10,35]. Among the many computational tasks associated with pangenome graphs, sequence-to-graph alignment remains a core computational problem.…”
Section: Introduction; citation type: mentioning
confidence: 99%