2022
DOI: 10.1101/2022.04.29.490059
Preprint

Evaluating deep learning for predicting epigenomic profiles

Abstract: Deep learning has been successful at predicting epigenomic profiles from DNA sequences. Most approaches frame this task as a binary classification relying on peak callers to define functional activity. Recently, quantitative models have emerged to directly predict the experimental coverage values as a regression. As new models continue to emerge with different architectures and training configurations, a major bottleneck is forming due to the lack of ability to fairly assess the novelty of proposed models and …
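To make the distinction in the abstract concrete, here is a minimal, illustrative PyTorch sketch (not taken from the paper) of the two task framings: a binary peak-classification head trained against peak-caller labels, and a quantitative head that regresses coverage values. All layer sizes and names are assumptions.

import torch
import torch.nn as nn

class BinaryPeakHead(nn.Module):
    """Binary framing: one peak / no-peak logit per track from pooled features."""
    def __init__(self, in_features, n_tracks):
        super().__init__()
        self.linear = nn.Linear(in_features, n_tracks)

    def forward(self, x):
        # Logits trained with nn.BCEWithLogitsLoss against peak-caller labels.
        return self.linear(x)

class QuantitativeHead(nn.Module):
    """Quantitative framing: regress normalized coverage per track."""
    def __init__(self, in_features, n_tracks):
        super().__init__()
        self.linear = nn.Linear(in_features, n_tracks)
        self.softplus = nn.Softplus()  # keeps predicted coverage non-negative

    def forward(self, x):
        # Trained with a regression loss (e.g. MSE or Poisson NLL) on coverage values.
        return self.softplus(self.linear(x))

features = torch.randn(8, 128)                    # pooled embeddings from a shared trunk
peak_logits = BinaryPeakHead(128, 15)(features)   # (8, 15) logits
coverage = QuantitativeHead(128, 15)(features)    # (8, 15) non-negative coverage values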

Cited by 11 publications (8 citation statements)
References 77 publications

“…9). We also found that gradient correction worked well for various CNNs trained to predict quantitative levels of normalized read-coverage of 15 ATAC-seq datasets at base-resolution 19 (Fig. 2e, Supplementary Fig.…”
mentioning
confidence: 70%
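The gradient correction referred to in this statement can be sketched under the assumption that it means zero-centering input gradients across the four nucleotide channels of a one-hot DNA sequence, which removes the gradient component that points off the one-hot simplex. A minimal NumPy illustration, with the (L, 4) array shape as an assumption:

import numpy as np

def correct_gradient(grad):
    """Zero-center an input gradient across the nucleotide channel axis.

    grad: array of shape (L, 4) holding d(output)/d(input) for a one-hot
    encoded DNA sequence of length L.
    """
    # Subtracting the per-position mean over the 4 channels removes the
    # gradient component orthogonal to the one-hot simplex.
    return grad - grad.mean(axis=-1, keepdims=True)

# Example with a random stand-in gradient for a 2 kb sequence.
g = np.random.randn(2000, 4)
g_corrected = correct_gradient(g)
assert np.allclose(g_corrected.sum(axis=-1), 0.0)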
“…We acquired the test data and the trained CNN-base and CNN-32 models with exponential activations and ReLU activations from Ref. 19 ; a total of 4 models. Each CNN takes as input 2kb length sequences and outputs a prediction of normalized read-coverage for 15 ATAC-seq bigWig tracks (i.e.…”
Section: Data
mentioning
confidence: 99%
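For readers unfamiliar with this interface, here is a hypothetical PyTorch sketch of a model with the stated input/output shape: a 2 kb one-hot sequence in, base-resolution normalized coverage for 15 ATAC-seq tracks out. The internal layers are placeholders, not the CNN-base or CNN-32 architectures from Ref. 19; the cited comparison swaps ReLU versus exponential activations within such models.

import torch
import torch.nn as nn

class CoverageCNN(nn.Module):
    """Toy sequence-to-coverage model: (batch, 4, 2000) -> (batch, 15, 2000)."""
    def __init__(self, n_tracks=15):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(4, 64, kernel_size=19, padding=9),
            nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=7, padding=3),
            nn.ReLU(),
        )
        self.head = nn.Conv1d(64, n_tracks, kernel_size=1)
        self.softplus = nn.Softplus()  # keep predicted coverage non-negative

    def forward(self, x):
        # x: one-hot DNA, shape (batch, 4, 2000)
        return self.softplus(self.head(self.body(x)))

model = CoverageCNN()
dummy = torch.zeros(1, 4, 2000)
print(model(dummy).shape)  # torch.Size([1, 15, 2000])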
“…In the context of BPNet [2], the authors therefore performed peak calling on ChIP-nexus data to select a set of regions highly enriched in count signal. However, recent work by Toneyan et al [51] suggests that peak callers select sites too conservatively, which may result in under-fitting of sequence-to-signal models.…”
Section: Methods
mentioning
confidence: 99%
“…With image data, basic affine transformations can translate, magnify, or rotate an image without changing its label. For genomics, the available neutral augmentations are reverse-complement transformations 16 and small random translations of the input sequence 17,18. With the finite size of experimental data and a paucity of augmentation methods, there exist only limited strategies to promote generalization for genomic DNNs.…”
mentioning
confidence: 99%
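A minimal NumPy sketch of the two neutral augmentations named in this statement, reverse-complement and small random translations, applied to a one-hot sequence of shape (L, 4). The A, C, G, T channel ordering and zero-padding at the edges are assumptions:

import numpy as np

def reverse_complement(one_hot):
    """Reverse position order and swap complementary channels (A<->T, C<->G)."""
    # Reversing both axes flips the sequence and maps channels A,C,G,T -> T,G,C,A.
    return one_hot[::-1, ::-1].copy()

def random_shift(one_hot, max_shift=20, rng=None):
    """Translate the sequence by a small random offset, zero-padding the ends."""
    if rng is None:
        rng = np.random.default_rng()
    shift = int(rng.integers(-max_shift, max_shift + 1))
    out = np.zeros_like(one_hot)
    if shift > 0:
        out[shift:] = one_hot[:-shift]
    elif shift < 0:
        out[:shift] = one_hot[-shift:]
    else:
        out[:] = one_hot
    return out

seq = np.eye(4)[np.random.default_rng().integers(0, 4, size=2000)]  # random one-hot 2 kb sequence
augmented = random_shift(reverse_complement(seq))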
“…An important downstream application of genomic DNNs is scoring the functional consequences of mutations. Following previous procedures 2,17, we compared model predictions with saturation mutagenesis of 15 cis-regulatory elements measured experimentally with a massively parallel reporter assay – data collected through the CAGI5 Challenge 21. As expected, EvoAug-trained DNNs outperformed standard training on this out-of-distribution generalization task (Fig.…
mentioning
confidence: 99%
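The comparison described here rests on in silico saturation mutagenesis: every single-nucleotide substitution is scored by the change it induces in the model output, and those scores are then correlated with the experimentally measured effects. A rough sketch, assuming a PyTorch model that maps a (1, 4, L) tensor to an output that can be reduced to a scalar readout by summation:

import numpy as np
import torch

def saturation_mutagenesis_scores(model, one_hot):
    """Score every single-nucleotide substitution of a reference sequence.

    one_hot: NumPy array of shape (L, 4). Returns an (L, 4) array where entry
    (i, b) is model(sequence with base b at position i) minus model(reference).
    """
    L = one_hot.shape[0]
    ref = torch.tensor(one_hot.T[None], dtype=torch.float32)  # (1, 4, L)
    with torch.no_grad():
        ref_score = model(ref).sum().item()
    scores = np.zeros((L, 4))
    for i in range(L):
        for b in range(4):
            mutant = one_hot.copy()
            mutant[i] = 0.0
            mutant[i, b] = 1.0
            x = torch.tensor(mutant.T[None], dtype=torch.float32)
            with torch.no_grad():
                scores[i, b] = model(x).sum().item() - ref_score
    return scores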