2019
DOI: 10.1101/868935
Preprint

Fast and flexible design of novel proteins using graph neural networks

Abstract: Protein structure and function is determined by the arrangement of the linear sequence of amino acids in 3D space. Despite substantial advances, precisely designing sequences that fold into a predetermined shape (the "protein design" problem) remains difficult. We show that a deep graph neural network, ProteinSolver, can solve protein design by phrasing it as a constraint satisfaction problem (CSP). To sidestep the considerable issue of optimizing the network architecture, we first develop a network that is ac…

Cited by 22 publications (19 citation statements)
References 46 publications
“…Previous machine learning models for the task of residue prediction conditioned on chemical environment have only been applied to single-shot residue prediction or architecture class, secondary structure, or ∆∆G prediction [60,89,62,63,61]. Other deep learning approaches have been developed for sequence design or rotamer packing [90,91,92,93,94,95], but most of these methods' designs have not been comprehensively validated by a range of biochemical metrics or by folding in silico or in vitro.…”
Section: Discussion (mentioning)
confidence: 99%
“… 197
3D CNN | gridded atomic coordinates | PDB-REDO | 19,436 | sequence recovery 70%, experimental validation of mutation | Shroff et al. 198
ProteinSolver | Graph NN | partial sequence, adjacency matrix | UniParc residues | sequence recovery of 35%, folding and MD test with 4 proteins | Strokach et al., 2019 199
gcWGAN | CGAN | random noise + structure | SCOPe | 20,125 | diversity and TM score of prediction from designed sequence cVAE | Karimi et al. 200
Graph Transformer | backbone structure in graph | CATH based | 18,025 | perplexity: 6.56 (rigid), 11.13 (flexible) (random: 20.00) | Ingraham et al.…”
Section: Protein Design (mentioning)
confidence: 99%
“…They were able to validate designed sequences in silico and demonstrate that some designs folded to their target structures in vitro [213]. …”
Section: Protein Design (mentioning)
confidence: 99%
“…Networks with architectures borrowed from language models have been trained on amino acid sequences, and been used to generate new sequences without considering protein structure explicitly [4,5]. Other methods have been developed to generate protein backbones without consideration of sequence [6], and to identify amino acid sequences which either fit well onto specified backbone structures [7-9] or are conditioned on low-dimensional fold representation [10]; models tailored to generate sequences and/or structures for specific protein families have also been developed [11-14]. However, none of the models described to date address the classical de novo protein design problem: generating new sequences predicted to fold to new structures.…”
Section: Introduction (mentioning)
confidence: 99%