2018
DOI: 10.1101/278762
Preprint

AIControl: Replacing matched control experiments with machine learning improves ChIP-seq peak identification

Abstract: Motivation: Accurately identifying the binding sites of regulatory proteins remains a central and unresolved challenge in molecular biology. The most commonly used experimental technique to determine binding locations of transcription factors is chromatin immunoprecipitation followed by DNA sequencing (ChIP-seq). Because ChIP-seq is highly susceptible to background noise, the current practice obtains one matched "control" ChIP-seq dataset and estimates position-wise background distributions using ChIP-seq sign…

Cited by 2 publications (2 citation statements)
References 52 publications

“…But plausibly, some kinds of peaks or some genomic regions are more likely to be false positives than others. Indeed, other ongoing work in ChIP-seq analysis aims at uncovering and removing local biases in ChIP-seq signals that can unduly influence peak calling (Hiranuma et al., 2016, 2018; Ramachandran et al., 2015). This suggests that peak-specific P-value corrections might be desirable, although it is unclear how this can best be done.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…For example, one could use generative adversarial networks (GANs) to generate data with the properties of real data and then use the created data to normalize the real data. Future approaches may include integrated strategies, where normalization is intrinsic to a specific type of analysis (e.g., [343]), and generic tools, which normalize the data that can then be used as input to any downstream analysis (e.g., [344,345,346]).…”
Section: Combining Mixed-technology Data (citation type: mentioning)
confidence: 99%