2021
DOI: 10.1038/s41467-021-25371-3

Epistatic Net allows the sparse spectral regularization of deep neural networks for inferring fitness functions

Abstract: Despite recent advances in high-throughput combinatorial mutagenesis assays, the number of labeled sequences available for predicting molecular functions has remained small relative to the vastness of the sequence space, compounded by the ruggedness of many fitness functions. While deep neural networks (DNNs) can capture high-order epistatic interactions among the mutational sites, they tend to overfit to the small number of labeled sequences available for training. Here, we developed Epistatic Net (EN), a method for spect…
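The "sparse spectral regularization" in the title refers to sparsity in the Walsh-Hadamard (epistatic interaction) spectrum of the fitness function. As a rough illustration (a self-contained numpy sketch, not the authors' implementation; the toy landscape and its coefficients are invented), the transform of a function over binary sequences exposes one coefficient per interaction set, and an L1 penalty on those coefficients is the kind of term a trainer could add to a DNN loss:

```python
import numpy as np

def walsh_hadamard(f):
    """Fast Walsh-Hadamard transform of f (length 2**n), normalized by 1/2**n."""
    y = np.asarray(f, dtype=float).copy()
    h = 1
    while h < len(y):
        for i in range(0, len(y), 2 * h):
            for j in range(i, i + h):
                a, b = y[j], y[j + h]
                y[j], y[j + h] = a + b, a - b
        h *= 2
    return y / len(y)

# Toy fitness landscape on n = 3 sites: two additive effects
# plus a single pairwise epistatic term between sites 0 and 2.
n = 3
seqs = np.array([[(k >> i) & 1 for i in range(n)] for k in range(2 ** n)])
signs = 1 - 2 * seqs  # map {0, 1} -> {+1, -1} (parity basis)
fitness = 0.5 * signs[:, 0] + 0.3 * signs[:, 1] + 0.2 * signs[:, 0] * signs[:, 2]

# The spectrum has exactly three nonzero coefficients: one per interaction set.
coeffs = walsh_hadamard(fitness)
l1_penalty = np.abs(coeffs).sum()  # sparsity-promoting term a trainer could add to the loss
print(np.round(coeffs, 3), l1_penalty)
```

The sketch only shows why the spectrum of an epistatically sparse landscape is itself sparse; in EN, per the abstract, the regularization is applied to the function the network learns, not to a fixed toy landscape.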

Cited by 28 publications (25 citation statements)
References 29 publications
“…Some algorithms specialize in modeling epistasis. Epistatic Net ( 27 ) introduced a neural network regularization strategy to limit the number of epistatic interactions. Other approaches focused on the global epistasis that arises due to a nonlinear transformation from a latent phenotype to the experimentally characterized function ( 28 , 29 ).…”
Section: Discussion (citation type: mentioning; confidence: 99%)
“…Future work in this direction could focus on combining DAEs and RNNs for sequence-based representation, and Graph Convolutional Networks (GCNs) for structure-based as well as PPI-based information. Combining these representations in a hierarchical classifier such as the multi-task DNN, together with biologically relevant regularization methods 42, 43, could allow for an explainable and computationally feasible DL architecture for protein function prediction.…”
Section: Major Successes of DL (citation type: mentioning; confidence: 99%)
“…To make them human-interpretable, the input is regularized using closed-form density functions of the data or GANs that mimic the data distribution. Methods that address the explainability question use more direct ways to gain insight into the NN function through its Taylor expansion 125 or Fourier transform 42, 62. The explanation takes the form of a heatmap showing the importance of each input feature.…”
Section: General Challenges for DL in the Biosciences (citation type: mentioning; confidence: 99%)
“…For comparison we also fit an additive model using ordinary least squares, as well as regularized pairwise and three-way regression models. Since both L1- and L2-regularized regression have been used to infer genotype-phenotype maps [39, 51, 64–66], here we fit the pairwise and three-way models using elastic net regression (see SI Appendix), where the penalty term for model complexity is a mixture of the L1 and L2 norms [67] and the relative weight of the two penalties is chosen via cross-validation. In addition to the linear regression models, we also fit a global epistasis model [45] where the binding score is modeled as a nonlinear transformation of a latent additive phenotype on which each possible mutation has a background-independent effect (SI Appendix).…”
Section: Application to Protein G (citation type: mentioning; confidence: 99%)
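The elastic-net fitting described in that passage can be sketched in a few lines. Below is a minimal, self-contained coordinate-descent implementation on synthetic genotype data (numpy only; the data, penalty settings, and helper names are illustrative, not those of the cited study). In practice one would use a library implementation and, as the authors describe, pick the penalty strength and L1/L2 mixing weight by cross-validation:

```python
import numpy as np
from itertools import combinations

def soft_threshold(z, t):
    """Soft-thresholding operator used in L1 proximal updates."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def elastic_net(X, y, alpha=0.01, l1_ratio=0.5, n_iter=300):
    """Coordinate-descent elastic net, minimizing
    (1/2n)||y - Xw||^2 + alpha*(l1_ratio*||w||_1 + 0.5*(1 - l1_ratio)*||w||_2^2)."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.astype(float).copy()              # residual for w = 0
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * w[j]             # remove feature j's current contribution
            rho = X[:, j] @ r / n
            w[j] = soft_threshold(rho, alpha * l1_ratio) / (col_sq[j] + alpha * (1.0 - l1_ratio))
            r -= X[:, j] * w[j]
    return w

rng = np.random.default_rng(0)
n_sites, n_obs = 8, 300

# Synthetic binary genotypes; phenotype = additive effects
# + one pairwise epistatic term (sites 0 and 3) + noise.
G = rng.integers(0, 2, size=(n_obs, n_sites)).astype(float)
beta = rng.normal(size=n_sites)
y = G @ beta + 1.5 * G[:, 0] * G[:, 3] + 0.1 * rng.normal(size=n_obs)

# Design matrix with additive and all pairwise interaction features.
pairs = list(combinations(range(n_sites), 2))
X = np.hstack([G] + [(G[:, i] * G[:, j])[:, None] for i, j in pairs])

w = elastic_net(X, y)
top_pair = pairs[int(np.argmax(np.abs(w[n_sites:])))]
print(top_pair)
```

The soft-thresholding step carries the L1 part of the penalty and the extra term in the denominator carries the L2 part; setting l1_ratio to 1 or 0 recovers the lasso or ridge coordinate updates, respectively.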