2022
DOI: 10.1101/2022.12.22.521582
Preprint

LegNet: a best-in-class deep learning model for short DNA regulatory regions

Abstract: Parallel reporter assays provide rich data to decipher gene regulatory regions with deep learning. Here we introduce LegNet, a convolutional network architecture that secured the first place for our autosome.org team in the DREAM 2022 challenge of predicting gene expression from gigantic parallel reporter assays. To construct LegNet, we drew inspiration from EfficientNetV2 and reformulated the sequence-to-expression regression problem as a soft-classification task. Here, with published data, we demonstrate tha…
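
The abstract mentions recasting sequence-to-expression regression as a soft-classification task. The following is a minimal sketch of one way such a reformulation can work, not the authors' implementation: a scalar expression value is spread over a fixed set of bins as a probability vector, the model is trained against that vector, and a scalar prediction is recovered as the expected bin index. The bin count (`n_bins`), the Gaussian width (`sigma`), and all function names are illustrative assumptions.

```python
import math


def soft_labels(expression, n_bins=18, sigma=0.5):
    """Spread a scalar expression value over integer bins 0..n_bins-1.

    Assumption: bins sit at integer positions and weights follow a Gaussian
    centred on the measured expression value. Both choices are illustrative.
    """
    weights = [
        math.exp(-0.5 * ((b - expression) / sigma) ** 2) for b in range(n_bins)
    ]
    total = sum(weights)
    # Normalise so the weights form a probability distribution.
    return [w / total for w in weights]


def expected_expression(probs):
    """Map a probability vector over bins back to a scalar prediction."""
    return sum(b * p for b, p in enumerate(probs))


def soft_cross_entropy(pred_probs, target_probs, eps=1e-12):
    """Cross-entropy between predicted and soft target distributions."""
    return -sum(t * math.log(p + eps) for p, t in zip(pred_probs, target_probs))


if __name__ == "__main__":
    target = soft_labels(7.3)
    uniform = [1.0 / len(target)] * len(target)
    print("recovered expression:", expected_expression(target))
    print("loss vs. uniform prediction:", soft_cross_entropy(uniform, target))
```

In this sketch the classification head would output one probability per bin (for example via a softmax), and `expected_expression` turns that output back into a continuous value, so the model can still be scored with regression metrics.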

Citations: cited by 3 publications (11 citation statements)
References: 26 publications (36 reference statements)

“…Our results also support the notion that epigenetic divergence is primarily driven by sequence divergence. While neural networks have shown promise in predicting epigenetic features and gene expression levels from DNA sequence [39,41,49], there is still a gap between current approaches and experiment-level predictions. While recent advances have been considerable, work in neural network scaling suggests improvements in model accuracy grow following a power law, requiring an exponential increase in both model and dataset size [50].…”
Section: Discussion (mentioning)
confidence: 99%
“…8a). We adapt Legnet [39] to this task which has achieved state of the art prediction accuracy for short sequence MPRA activity. We trained our model on three species and evaluated on a fourth unseen species (Fig.…”
Section: Deep Learning Models Predict Cell-type Specific Chromatin Ac... (mentioning)
confidence: 99%
“…LegNets (Penzar et al, 2022): As mentioned in Section 2, LegNets were the best predictors of PE in yeast in the DREAM challenge. We benchmark two LegNets - one with the same structure as the model that won the challenge, and a larger one with more filters in every convolutional layer.…”
Section: Mtlucifer (mentioning)
confidence: 88%
“…The advent of next-generation sequencing and additional high-throughput technologies has catalyzed the accumulation and public deposition of extensive databases, rich with functional genomic elements, enabling the broad application of computational methods to large-scale genomic data analysis [2]. We, along with others [3], have successfully employed machine-learning methods, including ensemble learning [4] and convolutional neural networks [5, 6], for this purpose. However, while potent, these approaches encounter constraints in identifying long-range dependencies within DNA sequences, a common phenomenon in human and other eukaryotic genomes [7].…”
Section: Main (mentioning)
confidence: 99%