2020
DOI: 10.1101/2020.07.22.211482
Preprint

De novo protein design by deep network hallucination

Abstract: There has been considerable recent progress in protein structure prediction using deep neural networks to infer distance constraints from amino acid residue co-evolution [1–3]. We investigated whether the information captured by such networks is sufficiently rich to generate new folded proteins with sequences unrelated to those of the naturally occurring proteins used in training the models. We generated random amino acid sequences and input them into the trRosetta structure prediction network to predict starting…

Cited by 49 publications (49 citation statements)
References 21 publications (27 reference statements)
“…The Fleishman laboratory has developed methods achieving rich structure diversity through combining native protein fragments (Lapidoth et al., 2015), and has successfully designed enzymes with comparable characteristics to native enzymes (Netzer et al., 2018). Other recently developed methods use machine learning to directly generate protein sequences for desired or novel protein folds (Anand, Eguchi, Derry, Altman, & Huang, 2020; Anishchenko, Chidyausiku, Ovchinnikov, Pellock, & Baker, 2020), which at the same time can provide proteins with great structural diversity. Other methods include SEWING (Jacobs et al., 2016), junction fusion protein creation (Brunette et al., 2020), and loop‐helix‐loop unit combinatorial sampling (Pan et al., 2020).…”
Section: Commentary (mentioning)
confidence: 99%
“… [table row] 192 | CNN (input design) | same as trRosetta | 27 out of 129 sequence-structure pairs experimentally validated | Anishchenko et al. 193
Abbreviations: CNN, convolutional neural network; DCGAN, deep convolutional generative adversarial network; GAN, generative adversarial network; VAE, variational autoencoder. …”
Section: Protein Design (mentioning)
confidence: 89%
“…Anishchenko et al. 193 iterated sequences through the DL network, trRosetta, 103 to “hallucinate” 215 mutually compatible sequence-structure pairs in a manner similar to “input design”. 183 By maximizing the contrast between the distance distributions predicted by trRosetta and a background network trained on noise, they obtained new sequences whose predicted distance maps showed sharp geometric features.…”
Section: Protein Design (mentioning)
confidence: 99%
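The statement above describes iterating sequences through trRosetta while maximizing the contrast between its predictions and a background distribution. As a rough illustration only, the following Python sketch shows how such a search in sequence space could look as a fixed-temperature Metropolis Monte Carlo loop; the function names predict_distance_distributions / predict_fn, the background array, and all parameter values are assumptions for the example, not the authors' actual implementation or API.

# Minimal hallucination sketch (assumed interfaces, not the published code).
# predict_fn(sequence) is assumed to return an (L, L, n_bins) array of
# inter-residue distance probabilities; `background` has the same shape.
import numpy as np

AMINO_ACIDS = list("ACDEFGHIKLMNPQRSTVWY")

def kl_contrast(pred, background, eps=1e-8):
    """Average KL divergence D_KL(pred || background) over all residue pairs."""
    kl = np.sum(pred * np.log((pred + eps) / (background + eps)), axis=-1)
    return kl.mean()

def hallucinate(predict_fn, background, length=100, steps=20000, temperature=0.1,
                rng=np.random.default_rng(0)):
    """Metropolis Monte Carlo search for a sequence whose predicted distance
    distributions differ sharply from the background."""
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    score = kl_contrast(predict_fn("".join(seq)), background)
    for _ in range(steps):
        trial = seq.copy()
        trial[rng.integers(length)] = rng.choice(AMINO_ACIDS)  # single point mutation
        trial_score = kl_contrast(predict_fn("".join(trial)), background)
        # Accept improvements outright; accept worse moves with Boltzmann probability.
        if trial_score > score or rng.random() < np.exp((trial_score - score) / temperature):
            seq, score = trial, trial_score
    return "".join(seq), score

In this sketch a higher score means the predicted distributions are further from the featureless background, which is one way to realize the "maximize the contrast" objective described in the quoted passage.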
“…The free hallucination loss is the KL divergence between the network predictions (y) and a background distribution (b) (7).…”
Section: Loss (mentioning)
confidence: 99%
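Written out, with y_ij and b_ij the predicted and background probability distributions over distance bins for residue pair (i, j), the KL-divergence loss described above takes the standard form below; the normalization over N residue pairs and the sign convention are assumptions for illustration.

L_{\mathrm{free}} \;=\; \frac{1}{N}\sum_{i<j} D_{\mathrm{KL}}\!\left(y_{ij}\,\|\,b_{ij}\right) \;=\; \frac{1}{N}\sum_{i<j}\sum_{k} y_{ij}(k)\,\log\frac{y_{ij}(k)}{b_{ij}(k)}

Here k indexes the distance bins; hallucination drives this quantity up (equivalently, minimizes its negative), so that the predicted distributions become sharply peaked relative to the background.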