2020
DOI: 10.1186/s13059-020-02038-8

Personalized and graph genomes reveal missing signal in epigenomic data

Abstract: Background: Epigenomic studies that use next generation sequencing experiments typically rely on the alignment of reads to a reference sequence. However, because of genetic diversity and the diploid nature of the human genome, we hypothesize that using a generic reference could lead to incorrectly mapped reads and bias downstream results. Results: We show that accounting for genetic variation using a modified reference genome or a de novo assembled genome can alter histone H3K4me1 and H3K27ac ChIP-seq peak cal…

Cited by 38 publications (37 citation statements)
References: 55 publications
“…Graph-based [17, 22] and personalized reference genomes [5, 23] mitigate reference allele bias. Existing linear reference coordinates can serve as backbones for variation-aware genome graphs.…”
Section: Introduction (mentioning)
confidence: 99%
“…The extra peaks are influenced by rescued unmapped reads and by the shift of peaks across the statistical significance threshold of the peak caller [29]. This multi-sample epigenomic dataset is an opportunity to better understand how reliable the altered peaks are compared to common peaks.…”
Section: Results (mentioning)
confidence: 99%
“…Previously, we introduced an axis that orders reference genomes according to how similar they are to the genome they are meant to represent [29]. Our axis ranged from the least similar sequence (the reference genome) to the most similar (the de novo assembly).…”
Section: Results (mentioning)
confidence: 99%
“…But linearity leads to reference bias: a tendency to miss alignments or report incorrect alignments for reads containing non-reference alleles. This can ultimately lead to confounding of scientific results, especially for analyses concerned with hypervariable regions [2], allele-specific effects [3][4][5][6], ancient DNA analysis [7, 8] or epigenomic signals [9]. These problems can be more or less adverse depending on the individual under study; e.g.…”
Section: Introduction (mentioning)
confidence: 99%