2020
DOI: 10.1021/acssynbio.0c00345

Discovery of Novel Gain-of-Function Mutations Guided by Structure-Based Deep Learning

Abstract: Despite the promise of deep learning accelerated protein engineering, examples of such improved proteins are scarce. Here we report that a 3D convolutional neural network trained to associate amino acids with neighboring chemical microenvironments can guide identification of novel gain-of-function mutations that are not predicted by energetics-based approaches. Amalgamation of these mutations improved protein function in vivo across three diverse proteins by at least 5-fold. Furthermore, this model provides a …

Cited by 90 publications (135 citation statements)
References 30 publications
“…We use a 3D convolutional neural network as our classifier θ, training the model on X-ray crystal structures of CATH 4.2 S95 domains [31, 32, 33], with train and test set domains separated at the topology level. For the amino acid type prediction task, our conditional model achieves a 57.3% test set accuracy, either outperforming [34] or matching [35, 36, 37] previously reported machine learning models for the same task. The predictions of the network correspond well with biochemically justified substitutability of the amino acids (Fig.…”
supporting
confidence: 63%
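The statement above mentions separating train and test domains at the topology level. As a rough illustration of what such a split can look like (not the citing authors' actual pipeline), the sketch below groups domains by the first three fields of their CATH code (Class.Architecture.Topology) and assigns whole groups to one side or the other. The function names, the domain labels, the 20% test fraction, and the random seed are all assumptions made for this example.

```python
import random
from collections import defaultdict


def topology_key(cath_code):
    """Return the Class.Architecture.Topology prefix of a C.A.T.H code."""
    return ".".join(cath_code.split(".")[:3])


def split_by_topology(domains, test_fraction=0.2, seed=0):
    """Split domains so that no topology appears in both train and test.

    domains: dict mapping a domain label to its CATH code ("C.A.T.H").
    Returns (train_labels, test_labels), each sorted.
    """
    # Group domain labels by their topology prefix.
    groups = defaultdict(list)
    for label, code in domains.items():
        groups[topology_key(code)].append(label)

    # Shuffle the topologies (not the domains) and hold out whole groups.
    topologies = sorted(groups)
    random.Random(seed).shuffle(topologies)
    n_test = max(1, round(len(topologies) * test_fraction))
    test_topologies = set(topologies[:n_test])

    train, test = [], []
    for topology, labels in groups.items():
        (test if topology in test_topologies else train).extend(labels)
    return sorted(train), sorted(test)


# Hypothetical domain labels paired with illustrative CATH codes.
domains = {
    "dom_a": "1.10.8.10",
    "dom_b": "1.10.8.20",
    "dom_c": "3.40.50.300",
    "dom_d": "3.40.50.720",
    "dom_e": "2.60.40.10",
}
train_labels, test_labels = split_by_topology(domains)
```

Splitting on groups rather than on individual domains is what keeps closely related folds from leaking between the two sets, which is presumably the point of the topology-level separation described in the quote.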
“…For the conditional residue prediction task, our classifier gives an improvement of 14.7% over [34] (42.5%), and similar performance to [37] (52.4%), [35] (56.4%) and [36] (58.0%). We note that we do not use the same train/test sets as these studies.…”
Section: Supplementary Text
mentioning
confidence: 73%
“…If the improvements seen during the evolution of their Go-playing reinforcement-learning-based programs [54,165,166] are anything of a guide, we may soon anticipate considerable further improvements. Similar comments might be made about the activities of specific protein sequences [167][168][169][170].…”
Section: Protein Structure Prediction
mentioning
confidence: 93%
“…40% accuracy. We have since improved upon this algorithm by including additional filters for features such as the presence of hydrogen atoms, partial charge, and solvent accessibility, and have ultimately improved the accuracy of re-prediction to upwards of 70% (28). We were interested in the approximately 30% of amino acids that were not predicted to be wild-type; while this might have merely reflected the inaccuracy of the neural network, it was also possible that nature itself was 'underpredicting' the fit of a given amino acid to its microenvironment, and that the ensemble of predicted non-wild-type amino acids at a given position might represent opportunities for mutation.…”
Section: Machine Learning Predictions Improve Br512 Function
mentioning
confidence: 99%
“…We were interested in the approximately 30% of amino acids that were not predicted to be wild-type; while this might have merely reflected the inaccuracy of the neural network, it was also possible that nature itself was 'underpredicting' the fit of a given amino acid to its microenvironment, and that the ensemble of predicted non-wild-type amino acids at a given position might represent opportunities for mutation. To this end, we instantiated the ability to predict underperforming wild-type amino acid residues in proteins by predicting either positions or precise mutations that led to improvements in function across a variety of proteins, including blue fluorescent protein and phosphomannose isomerase (28). It should be noted that such predictions were not for a particular functionality, such as stability or catalysis, but merely for goodness of fit, with improvements in stability and catalysis being empirical outcomes of those predictions.…”
Section: Machine Learning Predictions Improve Br512 Function
mentioning
confidence: 99%
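The two statements above describe treating positions where the network does not predict the wild-type residue as possible opportunities for mutation. A minimal sketch of that idea follows, assuming the per-position amino acid probabilities are already available from some model. The log-ratio score, the `rank_candidates` name, the threshold, and the toy probabilities are assumptions for illustration only and are not taken from the cited work.

```python
import math


def rank_candidates(sequence, probs, min_log_ratio=0.0):
    """Rank positions where the model prefers a non-wild-type residue.

    sequence: wild-type sequence as a string of one-letter codes.
    probs: one dict per position mapping amino acid -> predicted probability.
    Returns (position, wild_type, proposed, log_ratio) tuples, 1-based
    positions, sorted from the strongest to the weakest preference.
    """
    candidates = []
    for position, (wild_type, p) in enumerate(zip(sequence, probs), start=1):
        proposed = max(p, key=p.get)
        if proposed == wild_type:
            continue  # the model agrees with the wild type here
        # Assumed score: log of top probability over wild-type probability.
        wt_prob = max(p.get(wild_type, 0.0), 1e-9)
        log_ratio = math.log(p[proposed] / wt_prob)
        if log_ratio > min_log_ratio:
            candidates.append((position, wild_type, proposed, log_ratio))
    candidates.sort(key=lambda c: c[3], reverse=True)
    return candidates


# Toy three-residue example with made-up probabilities.
toy_sequence = "MKV"
toy_probs = [
    {"M": 0.9, "L": 0.1},
    {"K": 0.2, "R": 0.6, "E": 0.2},
    {"V": 0.1, "I": 0.8, "L": 0.1},
]
ranked = rank_candidates(toy_sequence, toy_probs)
```

As the quoted text notes, a score of this kind reflects goodness of fit to the microenvironment rather than any particular property such as stability or catalysis, so a ranked list like this would only be a starting point for experimental testing.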