2018
DOI: 10.1093/nar/gky215
DeFine: deep convolutional neural networks accurately quantify intensities of transcription factor-DNA binding and facilitate evaluation of functional non-coding variants

Abstract: The complex system of gene expression is regulated by the cell type-specific binding of transcription factors (TFs) to regulatory elements. Identifying variants that disrupt TF binding and lead to human diseases remains a great challenge. To address this, we implement sequence-based deep learning models that accurately predict the TF binding intensities to given DNA sequences. In addition to accurately classifying TF-DNA binding or unbinding, our models are capable of accurately predicting real-valued TF bindi…


Cited by 99 publications (85 citation statements)
References 56 publications
“…The detection of TADs is not as sensitive to resolution decline as algorithms for detecting TADs, we obtained roughly the same results when using the Hi-C data with various downsampling ratios [23].…”
Section: DeepHiC Is More Precise in Detecting TAD Boundaries (supporting)
confidence: 62%
“…Due to the common misconception that first convolutional layer filters learn motifs (Alipanahi et al., 2015; Kelley et al., 2016; Quang and Xie, 2016; Angermueller et al., 2016; Cuperus et al., 2017; Chen et al., 2018; Kelley et al., 2018; Bretschneider et al., 2018; Ben-Bassat et al., 2018; Wang et al., 2018; Gao et al., 2018; Trabelsi et al., 2019), deep learning practitioners continue to employ CNN architectures with large first-layer filters with the intent of capturing motif patterns in their entirety. However, we have shown that employing a large filter does not necessarily lead to whole motif representations.…”
Section: Motif Representations Are Not Very Sensitive to 1st Layer Filters (mentioning)
confidence: 99%
“…A common method to validate a trained CNN is to demonstrate that first layer filters have learned biologically meaningful representations, i.e. PWM-like representations of sequence motifs (Alipanahi et al., 2015; Kelley et al., 2016; Quang and Xie, 2016; Angermueller et al., 2016; Cuperus et al., 2017; Chen et al., 2018; Kelley et al., 2018; Bretschneider et al., 2018; Ben-Bassat et al., 2018; Wang et al., 2018; Gao et al., 2018; Trabelsi et al., 2019). The few studies that perform a quantitative motif comparison of the first layer filters against a motif database find that less than 50% have a statistically significant match (Kelley et al., 2016; Quang and Xie, 2016).…”
Section: Introduction (mentioning)
confidence: 99%
“…More recently, the rapid development of deep learning techniques has enabled mining in high-dimensional sequence data. Some examples include DeepSEA (Zhou & Troyanskaya, 2015), DeepBind (Alipanahi, Delong, Weirauch, & Frey, 2015), DanQ (Quang & Xie, 2016), DeFine (Wang, Tai, E, & Wei, 2018), and Basenji (Kelley et al., 2018). However, because the data sets used for training in those algorithms vary, comparisons across different models can become a problem considering there is currently no gold standard for evaluation (Nishizaki & Boyle, 2017).…”
Section: Introduction (mentioning)
confidence: 99%