2023
DOI: 10.1093/bioinformatics/btad457

LegNet: a best-in-class deep learning model for short DNA regulatory regions

Abstract: Motivation: The increasing volume of data from high-throughput experiments including parallel reporter assays facilitates the development of complex deep learning approaches for modeling DNA regulatory grammar. Results: Here we introduce LegNet, an EfficientNetV2-inspired convolutional network for modeling short gene regulatory regions. By approaching the sequence-to-expression regression problem as a soft classification task, …
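The abstract describes recasting sequence-to-expression regression as a soft classification task. The general idea can be sketched in plain Python: a continuous expression value is spread over neighbouring bins as a probability vector, and a prediction is read back as the expected bin index. The bin count (18), the linear-interpolation scheme, and the function names are illustrative assumptions, not the authors' published procedure.

```python
from typing import List

NUM_BINS = 18  # assumption: bin count chosen only for illustration


def soft_label(value: float, num_bins: int = NUM_BINS) -> List[float]:
    """Spread a scalar over its two nearest integer bins (hypothetical scheme)."""
    # Clamp the value to the range covered by the bins.
    clamped = min(max(value, 0.0), float(num_bins - 1))
    lower = int(clamped)
    upper = min(lower + 1, num_bins - 1)
    frac = clamped - lower
    probs = [0.0] * num_bins
    # The closer bin receives the larger share of the probability mass.
    probs[lower] += 1.0 - frac
    probs[upper] += frac
    return probs


def expected_value(probs: List[float]) -> float:
    """Turn a probability vector over bins back into a scalar prediction."""
    return sum(index * p for index, p in enumerate(probs))


if __name__ == "__main__":
    target = soft_label(7.3)
    print(target[7], target[8])
    print(expected_value(target))
```

With this encoding a classifier can be trained against the probability vector, while the scalar prediction stays recoverable through the expectation.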

Citations: cited by 9 publications (6 citation statements)
References: 19 publications
“…The following architectural choices were used in the final model: (i) grouped convolution (59) instead of the depthwise convolution of the original EfficientNetV2, (ii) the standard residual blocks were substituted with residual channel-wise concatenations, (iii) a bilinear layer was inserted in the middle of the EfficientNetV2 SE-block. A detailed study on this architecture is presented in (60).…”
Section: Methods (mentioning)
confidence: 99%
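The first two architectural choices in the statement above can be made concrete with some weight-count arithmetic. The sketch below, in plain Python, compares a standard, a grouped, and a depthwise 1D convolution, and shows how channel-wise concatenation changes the channel count of a residual connection. The channel width (128), kernel size (7), and group count (16) are assumptions picked for illustration; the quoted text does not give them.

```python
def conv1d_weight_count(in_channels: int, out_channels: int,
                        kernel_size: int, groups: int = 1) -> int:
    """Number of weights (bias excluded) in a 1D convolution with `groups`."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("channels must be divisible by groups")
    # Each output channel only sees in_channels / groups input channels.
    return out_channels * (in_channels // groups) * kernel_size


def concat_residual_channels(in_channels: int, block_out_channels: int) -> int:
    """Channels after concatenating a block's input with its output.

    An additive residual keeps the channel count unchanged; concatenation
    stacks the two tensors along the channel axis instead.
    """
    return in_channels + block_out_channels


if __name__ == "__main__":
    channels, kernel = 128, 7  # assumed values, not taken from the source
    print(conv1d_weight_count(channels, channels, kernel))             # standard
    print(conv1d_weight_count(channels, channels, kernel, groups=16))  # grouped
    print(conv1d_weight_count(channels, channels, kernel, groups=channels))  # depthwise
    print(concat_residual_channels(channels, channels))
```

Depthwise convolution is the special case where the group count equals the channel count, so a grouped convolution sits between it and a standard convolution in both weight count and cross-channel mixing.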
“…It contains modifications like replacing depthwise convolution with grouped convolution, using Squeeze and Excitation (SE) blocks (67), and adopting channel-wise concatenation for residual connections. The channel configuration starts with 256 channels for the initial block, followed by 128, 128, 64, 64, 64, and 64 channels (60).…”
Section: Prix Fixe Net: (I) Dream-cnn (mentioning)
confidence: 99%
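The channel configuration quoted above can be written down as a simple layer plan. The sketch below, in plain Python, pairs each block's output width with the width it receives. It assumes a 4-channel one-hot DNA input and that each block feeds the next directly, ignoring any widening from channel-wise concatenation; neither detail is stated in the quoted text, and the names are hypothetical.

```python
from typing import List, Tuple

# Channel widths as listed in the citation statement.
BLOCK_CHANNELS = [256, 128, 128, 64, 64, 64, 64]

# Assumption: one-hot encoded DNA (A, C, G, T) as network input.
INPUT_CHANNELS = 4


def layer_plan(input_channels: int,
               block_channels: List[int]) -> List[Tuple[int, int]]:
    """Return (in_channels, out_channels) for each block in order."""
    widths = [input_channels] + list(block_channels)
    return list(zip(widths[:-1], widths[1:]))


if __name__ == "__main__":
    for index, (n_in, n_out) in enumerate(layer_plan(INPUT_CHANNELS, BLOCK_CHANNELS)):
        print(f"block {index}: {n_in} -> {n_out}")
```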
“…Despite a recent push to favor large‐scale attention transformer models in this field, some researchers have argued that despite excellent performance in protein structure prediction, text mining, and genomic data analysis, the quality of transformer models can be overestimated under certain test scenarios. [194,195] Concerns also persist regarding their ability to effectively capture long‐range interactions. [194]…”
Section: Applying Machine Learning Techniques To Decipher the Cis‐reg... (mentioning)
confidence: 99%
“…A relevant development has been LegNet, a CNN for modeling short gene regulatory regions that achieved first rank in predicting promoter expression from a gigantic parallel reporter assay at the DREAM 2022 challenge. [195] The authors highlight that fully convolutional networks should be recognized as a dependable method for computationally modeling short gene regulatory regions and predicting the consequences of regulatory sequence modifications. However, ultimately, it is critical to remember that the effectiveness of machine learning and AI models hinges on the quality of experimental data, with current limitations in wet lab techniques contributing to challenges in precisely defining enhancers across the genome and occasionally leading to poor reproducibility even in replicates of the same experiment.…”
Section: Applying Machine Learning Techniques To Decipher the Cis‐reg... (mentioning)
confidence: 99%
“…In addition, while many of these models accurately predict TF binding and accessible chromatin in the genome, they are trained on indirect proxies of cis-regulation, and therefore less accurately predict cis-regulatory activity. Models trained on massively parallel reporter gene assays (MPRAs) [19,32–37] do predict cis-regulatory activity directly. Still, MPRA studies must also contend with the same fundamental limitation: the number of genomic training examples in any particular cell type is small relative to the scale of the training data typically needed to model the interactions defining cis-regulatory grammars [38].…”
Section: Introduction (mentioning)
confidence: 99%