2021
DOI: 10.1186/s13059-020-02218-6

An interpretable bimodal neural network characterizes the sequence and preexisting chromatin predictors of induced transcription factor binding

Abstract: Background: Transcription factor (TF) binding specificity is determined via a complex interplay between the transcription factor’s DNA binding preference and cell type-specific chromatin environments. The chromatin features that correlate with transcription factor binding in a given cell type have been well characterized. For instance, the binding sites for a majority of transcription factors display concurrent chromatin accessibility. However, concurrent chromatin features reflect the binding a…

Cited by 16 publications (15 citation statements)
References 69 publications
“…We chose neural networks owing to their ability to learn arbitrarily complex predictive sequence patterns (Kelley et al 2018; Fudenberg et al 2020; Avsec et al 2021a,b; Koo et al 2021). In particular, hybrid convolutional and recurrent network architectures have successfully been applied to accurately predict TF binding in diverse applications (Quang and Xie 2016; Quang and Xie 2019; Srivastava et al 2021). The motivation behind these architectures is that convolutional filters can encode binding site motifs and other contiguous sequence features, whereas the recurrent layers can model flexible, higher-order spatial organization of these features.…”
Section: Results (mentioning)
confidence: 99%
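
The statement above says that convolutional filters can encode binding site motifs. A minimal sketch of that idea follows, assuming a one-hot encoded DNA sequence and a single hand-built filter. The motif "TGAC", the test sequence, and the helper names `one_hot` and `conv_scan` are hypothetical and chosen only for illustration; they are not taken from any of the cited papers, and a trained network would learn its filter weights rather than have them set by hand.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (length, 4) matrix; unknown bases stay all-zero."""
    mat = np.zeros((len(seq), 4), dtype=float)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            mat[i, j] = 1.0
    return mat

def conv_scan(x, filt):
    """Slide a (k, 4) filter along a (L, 4) input and return L - k + 1 scores."""
    k = filt.shape[0]
    return np.array([np.sum(x[i:i + k] * filt) for i in range(x.shape[0] - k + 1)])

# Hypothetical filter: an exact-match detector for the toy motif "TGAC".
motif_filter = one_hot("TGAC")

# Scan a toy sequence; the score at each offset counts matching bases.
scores = conv_scan(one_hot("AATGACGG"), motif_filter)
best = int(np.argmax(scores))
```

Here `best` is the offset where the filter responds most strongly, which is the position of the motif in the toy sequence. In a hybrid architecture, per-position scores like these from many filters would be passed on to a recurrent layer.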
“…It would be important for ML and computational biology researchers to work together to identify additional important steps. For example, many computational biology applications involve training a multi-modal prediction models (e.g., Srivastava et al (2021) incorporates DNA sequence with epigenetic signals to predict TF binding and Chen et al (2020) leverages both cancer histology images and genomic features for survival prediction). An open research question is to define a verification step to check whether explanations properly attributes importance scores to each modality.…”
Section: Discussion (mentioning)
confidence: 99%
“…As for CNN-plus, (i) we just simply encoded DNA sequences and chromatin accessibility signals, but more appropriate methods for dealing with the inputs need to be further explored, e.g, k-mer embedding representing high-order dependencies of nucleotides (48,49), a bimodal neural network for separately handling DNA sequences and chromatin accessibility signals (50); (ii) we just considered a simple CNN architecture composed of three convolutional layers, but advanced DL-based models also need to be further explored, e.g, hybrid neural networks (51,52), transformer architectures (53,54), since they have been proved to be equipped with stronger feature learning ability. Given the massive data being generated by the ENCODE Consortium and other large-scale efforts, there is an excellent opportunity to learn richer representations to more fully understand cell-type-specific and shared binding activities.…”
Section: Discussion (mentioning)
confidence: 99%
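
The statement above mentions k-mer embedding as an alternative input encoding. The sketch below shows only the tokenization step that would typically come before an embedding layer, assuming overlapping k-mers over the alphabet A, C, G, T. The function names `kmer_vocab` and `kmer_tokens` and the choice of -1 for k-mers containing other characters are hypothetical conventions for this example, not the method of the cited work.

```python
from itertools import product

def kmer_vocab(k):
    """Map every k-mer over ACGT to an integer index in lexicographic order."""
    return {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}

def kmer_tokens(seq, k):
    """Convert a DNA string into indices of its overlapping k-mers.

    k-mers containing characters outside ACGT are mapped to -1 (an assumption
    made here for illustration).
    """
    vocab = kmer_vocab(k)
    seq = seq.upper()
    return [vocab.get(seq[i:i + k], -1) for i in range(len(seq) - k + 1)]

# "ACGTA" contains the overlapping 3-mers ACG, CGT and GTA.
tokens = kmer_tokens("ACGTA", 3)
```

Each index could then be looked up in a learned embedding table, so that a model sees a dense vector per k-mer instead of a one-hot vector per nucleotide.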