2022
DOI: 10.1038/s43588-022-00273-6
Rotamer-free protein sequence design based on deep learning and self-consistency

Abstract: We present ABACUS-R, a method based on deep learning for designing amino acid sequences that autonomously fold into a given target backbone. This method predicts the sidechain type of a central residue from its 3D local environment by using an encoder-decoder network trained with a multi-task learning strategy. The environmental features encoded by the network include the types but not the conformations of the sidechains of surrounding residues. This eliminates the need for reconstructing and optimizing sidec…
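The self-consistency idea in the abstract can be illustrated with a minimal sketch: each residue's type is repeatedly re-predicted from the current types of its surrounding residues until the sequence stops changing. This is not the authors' implementation; `predict_residue` below is a hypothetical deterministic stand-in for the trained encoder-decoder, and the convergence loop is only schematic.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def predict_residue(backbone_features, neighbor_types, pos):
    # Hypothetical stand-in for the ABACUS-R encoder-decoder: a real
    # model would consume 3D local-environment features; this toy just
    # maps (position, neighbor types) to a residue deterministically.
    score = pos + sum(ord(a) for a in neighbor_types)
    return AMINO_ACIDS[score % len(AMINO_ACIDS)]

def design_sequence(backbone_features, length, max_iters=50, seed=0):
    """Iteratively update each residue's type conditioned on the current
    types of its neighbors until no residue changes, i.e. the sequence
    reaches a self-consistent fixed point (or max_iters is hit)."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    for _ in range(max_iters):
        changed = False
        for i in range(length):
            neighbors = seq[:i] + seq[i + 1:]  # surrounding residue types
            new_type = predict_residue(backbone_features, neighbors, i)
            if new_type != seq[i]:
                seq[i] = new_type
                changed = True
        if not changed:  # every residue agrees with its environment
            break
    return "".join(seq)
```

The key property mirrored here is that only the *types* of surrounding residues enter the prediction, so no sidechain conformations ever need to be built or optimized during the iteration.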


Cited by 40 publications (63 citation statements) · References 61 publications
“…Ingraham [35] trained an encoder-decoder Structured Transformer, where the GNN encoder learnt protein structures represented as graphs, while the decoder sampled sequences conditioned on the encoder-learned structure representations. Another encoder-decoder, ABACUS-R [69], took as encoder input the backbone structural features and the sidechain types of the residues surrounding a given residue, and employed a decoder to output the sidechain type of that residue. MIF [70] adapted Ingraham’s [35] architecture into a bidirectional denoising model.…”
Section: The Deep Learning Era Of Protein Sequence and Structure Gene…
confidence: 99%
“…DenseCPD learns the atom distribution information from the structures using DenseNet [130] (CNN-derived), and predicts the probability of the amino acids that build the input protein backbone. This approach displayed higher accuracy than the later-released ABACUS-R [131], despite ABACUS-R relying on a Transformer to extract more information from both protein sequence and structure. The aim of using DenseCPD is to find the most suitable sequences for the protein backbone, and the model currently supports only tasks submitted online.…”
Section: De Novo Design Of Food Enzymes
confidence: 99%
“…Then, we examined the effects of the GAN losses by comparing two models: one (PriorDDPM) trained without GAN and the other (PriorDDPM-GAN) trained with GAN (more specifically, the PriorDDPM was trained first until full convergence of its training losses, and then the PriorDDPM-GAN was tuned from the trained PriorDDPM using extra learning epochs with the GAN losses). Two types of metrics were considered: one type was the structural deviations between the OSs and the natural ISs, and the other was the so-called self-consistent structure deviations, i.e. the deviations between the OSs and the structures predicted from sequences designed on the OSs (here we applied ABACUS-R, a deep learning method for fixed-backbone sequence design [16], to select amino acid sequences on the OSs, and used AlphaFold2 to predict the potential folded structures from the selected sequences). The results in Figure 2B show that PriorDDPM-GAN outperforms PriorDDPM by large margins in both types of metrics.…”
Section: The Effects Of Different Components Of the Model
confidence: 99%
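The self-consistent structure deviation described in this passage reduces to a simple composition: design a sequence on the output structure, refold it, and measure the deviation. A minimal sketch, where `design_fn`, `fold_fn`, and `rmsd_fn` are hypothetical stand-ins for the sequence-design model (e.g. ABACUS-R), the structure predictor (e.g. AlphaFold2), and a structural-deviation metric:

```python
def self_consistency_deviation(output_structure, design_fn, fold_fn, rmsd_fn):
    """Deviation between a generated structure and the structure
    refolded from a sequence designed on it. Lower values indicate
    a more self-consistent (designable) backbone."""
    sequence = design_fn(output_structure)   # fixed-backbone sequence design
    refolded = fold_fn(sequence)             # structure prediction
    return rmsd_fn(output_structure, refolded)
```

With trivial stand-ins that refold a sequence back to the original structure, the deviation is zero; any mismatch between the generated and refolded structures raises it.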
“…To circumvent these difficulties, machine learning methods, and more recently deep learning methods, are increasingly being applied [7,8,9]. For structure-based sequence design, deep learning methods achieving outstanding native-sequence recovery rates have been developed [10,11,12,13,14,15], with several methods demonstrated in wet experimental tests to be much more robust and accurate [16,17] than conventional energy-minimization approaches. The more challenging problem of de novo structure design is also being actively investigated [18,7,4,19].…”
Section: Introduction
confidence: 99%