Shushan Toneyan scite author profile

Deep learning has been successful at predicting epigenomic profiles from DNA sequences. Most approaches frame this task as a binary classification relying on peak callers to define functional activity. Recently, quantitative models have emerged to directly predict the experimental coverage values as a regression. As new models continue to emerge with different architectures and training configurations, a major bottleneck is forming due to the lack of ability to fairly assess the novelty of proposed models and their utility for downstream biological discovery. Here we introduce a unified evaluation framework and use it to compare various binary and quantitative models trained to predict chromatin accessibility data. We highlight various modeling choices that affect generalization performance, including a downstream application of predicting variant effects. In addition, we introduce a robustness metric that can be used to enhance model selection and improve variant effect predictions. Our empirical study largely supports that quantitative modeling of epigenomic profiles leads to better generalizability and interpretability.

EvoAug: improving generalization and interpretability of genomic deep neural networks with evolution-inspired data augmentations

Lee, Tang, Toneyan, et al. 2023. Genome Biol

Deep neural networks (DNNs) hold promise for functional genomics prediction, but their generalization capability may be limited by the amount of available data. To address this, we propose EvoAug, a suite of evolution-inspired augmentations that enhance the training of genomic DNNs by increasing genetic variation. Random transformation of DNA sequences can potentially alter their function in unknown ways, so we employ a fine-tuning procedure using the original non-transformed data to preserve functional integrity. Our results demonstrate that EvoAug substantially improves the generalization and interpretability of established DNNs across prominent regulatory genomics prediction tasks, offering a robust solution for genomic DNNs.

Deconvolution of expression for nascent RNA-sequencing data (DENR) highlights pre-RNA isoform diversity in human cells

Zhao, Dukler, Barshad, et al. 2021

Motivation: Quantification of isoform abundance has been extensively studied at the mature-RNA level using RNA-seq but not at the level of precursor RNAs using nascent RNA sequencing. Results: We address this problem with a new computational method called Deconvolution of Expression for Nascent RNA sequencing data (DENR), which models nascent RNA sequencing read counts as a mixture of user-provided isoforms. The baseline algorithm is enhanced by machine-learning predictions of active transcription start sites and an adjustment for the typical “shape profile” of read counts along a transcription unit. We show that DENR outperforms simple read-count-based methods for estimating gene and isoform abundances, and that transcription of multiple pre-RNA isoforms per gene is widespread, with frequent differences between cell types. In addition, we provide evidence that a majority of human isoform diversity derives from primary transcription rather than from post-transcriptional processes. Availability: DENR and nascentRNASim are freely available at https://github.com/CshlSiepelLab/DENR (version v1.0.0) and https://github.com/CshlSiepelLab/nascentRNASim (version v0.3.0). Supplementary information: Supplementary data are available at Bioinformatics online.

ETV6 dependency in Ewing sarcoma by antagonism of EWS-FLI1-mediated enhancer activation

Gao et al. 2023. Nat Cell Biol

Evaluating deep learning for predicting epigenomic profiles

Toneyan, Tang, Koo. 2022. Preprint
