2016
DOI: 10.1093/bib/bbw089

Computational pan-genomics: status, promises and challenges

Abstract: Many disciplines, from human genetics and oncology to plant breeding, microbiology and virology, commonly face the challenge of analyzing rapidly increasing numbers of genomes. In the case of Homo sapiens, the number of sequenced genomes will approach hundreds of thousands in the next few years. Simply scaling up established bioinformatics pipelines will not be sufficient for leveraging the full potential of such rich genomic data sets. Instead, novel, qualitatively different computational methods and paradigms ar…

citations
Cited by 186 publications
(120 citation statements)
references
References 157 publications
“…Computational pan-genomics has recently emerged as an important sub-branch of bioinformatics [35]. One of the motivations is the analysis of sequencing data in the context of whole species.…”
Section: Simplitigs Of Bacterial Pan-genomes
mentioning
confidence: 99%
“…The resulting lack of diversity introduces a systematic bias that makes samples look more like the reference genome [20]. This reference bias can be reduced by using pangenomic models, which incorporate the genomic content of populations of individuals [1]. Sequence graphs are a popular representation of pangenomes that can express all of the variation in a pangenome [13].…”
Section: Introduction
mentioning
confidence: 99%
“…However, some reads cannot be aligned to a reference genome, particularly those originating from highly polymorphic regions and regions absent from the reference genome. Reference genome alignments are also generally done without awareness of variation, causing mapping bias towards the reference allele and misalignments around indels 10,11.…”
mentioning
confidence: 99%
“…Although approaches that find polymorphisms in reference-free assemblies have been developed to avoid these limitations 16,17, de novo assembly algorithms remain computationally expensive, have less sensitivity 17, and use data structures that have a complex coordinate system. Pangenomes 12,18,19 have recently been proposed to counter weaknesses of both reference alignments and de novo assemblies by extending the linear reference alignments with variation-aware alignments 20. Pangenomes incorporate prior information about variation, allowing read aligners to better distinguish between sequencing errors in reads and true sequence variation.…”
mentioning
confidence: 99%