2021
DOI: 10.1186/s13059-020-02229-3

Reference flow: reducing reference bias using multiple population genomes

Abstract: Most sequencing data analyses start by aligning sequencing reads to a linear reference genome, but failure to account for genetic variation leads to reference bias and confounding of results downstream. Other approaches replace the linear reference with structures like graphs that can include genetic variation, incurring major computational overhead. We propose the reference flow alignment method that uses multiple population reference genomes to improve alignment accuracy and reduce reference bias. Compared t…

Citations: cited by 66 publications (60 citation statements)
References: 55 publications
“…Instead, large-scale reference panels from a wide range of populations can provide similar information [5,6]. Recent studies use such information to improve alignment accuracy and reduce biases in alignment [11][12][13], but there has been little work to incorporate population data with variant calling.…”
Section: Introduction (mentioning)
confidence: 99%
“…Full amelioration of reference bias likely requires a genome from all six species in the study, which are not currently available. If they do become available, the optimal strategy for reducing reference bias is not yet clear, but strategies like aligning the resequencing data to multiple assemblies, e.g., [65, 66] or to a Vitis pangenome may prove fruitful. Fortunately, however, at least two studies have evaluated reference bias in population genetics analyses, and they concluded that the effect of the reference bias is unlikely to bias broad demographic and evolutionary genomic analyses [62, 63].…”
Section: Discussion (mentioning)
confidence: 99%
“…For years, a myriad of genomic analyses have been conducted by mapping to this genome, including functional annotation [44], disease association [45][46][47][48] and genetic ancestry tests [49][50][51][52]. However, the nature of mapping restricts its resolution to identify novel variants other than those found in the initially recruited samples [53,54]. Our proposed GABOLA assembly integrates the merits of linked read technology and optical mapping, while providing our novelty in locating associated reads in potentially difficult-to-solve regions.…”
Section: Contribution To Human Reference Genome (mentioning)
confidence: 99%