2022
DOI: 10.1038/s43588-022-00273-6
Rotamer-free protein sequence design based on deep learning and self-consistency

Abstract: We present ABACUS-R, a method based on deep learning for designing amino acid sequences that autonomously fold into a given target backbone. This method predicts the sidechain type of a central residue from its 3D local environment by using an encoder-decoder network trained with a multi-task learning strategy. The environmental features encoded by the network include the types but not the conformations of the sidechains of surrounding residues. This eliminates the need for reconstructing and optimizing sidec…
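The self-consistency idea in the abstract can be illustrated with a minimal sketch: each residue's type is repeatedly re-predicted from the current types of its surrounding residues until the sequence stops changing. This is not the authors' implementation; `predict_residue` below is a hypothetical deterministic stand-in for the trained encoder-decoder, and the convergence loop is only schematic.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def predict_residue(backbone_features, neighbor_types, pos):
    # Hypothetical stand-in for the ABACUS-R encoder-decoder: a real
    # model would consume 3D local-environment features; this toy just
    # maps (position, neighbor types) to a residue deterministically.
    score = pos + sum(ord(a) for a in neighbor_types)
    return AMINO_ACIDS[score % len(AMINO_ACIDS)]

def design_sequence(backbone_features, length, max_iters=50, seed=0):
    """Iteratively update each residue's type conditioned on the current
    types of its neighbors until no residue changes, i.e. the sequence
    reaches a self-consistent fixed point (or max_iters is hit)."""
    rng = random.Random(seed)
    seq = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    for _ in range(max_iters):
        changed = False
        for i in range(length):
            neighbors = seq[:i] + seq[i + 1:]  # surrounding residue types
            new_type = predict_residue(backbone_features, neighbors, i)
            if new_type != seq[i]:
                seq[i] = new_type
                changed = True
        if not changed:  # every residue agrees with its environment
            break
    return "".join(seq)
```

The key property mirrored here is that only the *types* of surrounding residues enter the prediction, so no sidechain conformations ever need to be built or optimized during the iteration.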


Cited by 40 publications (63 citation statements) · References 61 publications
“…Ingraham [35] trained an encoder-decoder Structured Transformer, where the GNN encoder learnt protein structures represented as graphs, while the decoder sampled sequences conditioned on the encoder-learned structure representations. Another encoder-decoder, ABACUS-R [69], took as encoder input the backbone structural features and the sidechain types of the residues surrounding a given residue, and employed a decoder to output the sidechain type of that residue. MIF [70] adapted Ingraham’s [35] architecture into a bidirectional denoising model.…”
Section: The Deep Learning Era Of Protein Sequence and Structure Gene…
confidence: 99%
“…DenseCPD learns the atom distribution information from the structures using DenseNet [130] (CNN-derived), and predicts the probability of the amino acids that build the input protein backbone. This approach displayed higher accuracy than the later-released ABACUS-R [131], despite ABACUS-R relying on a Transformer to extract more information from both protein sequence and structure. The aim of using DenseCPD is to find the most suitable sequences for the protein backbone, and the model currently supports only tasks submitted online.…”
Section: De Novo Design Of Food Enzymes
confidence: 99%
“…Then, we examined the effects of the GAN losses by comparing two models: one (PriorDDPM) trained without GAN and the other (PriorDDPM-GAN) trained with GAN (more specifically, the PriorDDPM was trained first until full convergence of its training losses, and then the PriorDDPM-GAN was tuned from the trained PriorDDPM using extra learning epochs with the GAN losses). Two types of metrics were considered: one type was the structural deviations between the OSs and the natural ISs, and the other was the so-called self-consistent structure deviations, i.e. the deviations between the OSs and the structures predicted from sequences designed on the OSs (here we applied ABACUS-R, a deep learning method for fixed-backbone sequence design [16], to select amino acid sequences on the OSs, and used AlphaFold2 to predict the potential folded structures from the selected sequences). The results in Figure 2B show that PriorDDPM-GAN outperforms PriorDDPM by large margins in both types of metrics.…”
Section: The Effects Of Different Components Of the Model
confidence: 99%
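The self-consistent structure deviation described in this passage reduces to a simple composition: design a sequence on the output structure, refold it, and measure the deviation. A minimal sketch, where `design_fn`, `fold_fn`, and `rmsd_fn` are hypothetical stand-ins for the sequence-design model (e.g. ABACUS-R), the structure predictor (e.g. AlphaFold2), and a structural-deviation metric:

```python
def self_consistency_deviation(output_structure, design_fn, fold_fn, rmsd_fn):
    """Deviation between a generated structure and the structure
    refolded from a sequence designed on it. Lower values indicate
    a more self-consistent (designable) backbone."""
    sequence = design_fn(output_structure)   # fixed-backbone sequence design
    refolded = fold_fn(sequence)             # structure prediction
    return rmsd_fn(output_structure, refolded)
```

With trivial stand-ins that refold a sequence back to the original structure, the deviation is zero; any mismatch between the generated and refolded structures raises it.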
“…To circumvent these difficulties, machine learning methods, and more recently deep learning methods, are increasingly being applied [7,8,9]. For structure-based sequence design, deep learning methods achieving outstanding native-sequence recovery rates have been developed [10,11,12,13,14,15], with several methods demonstrated in wet experimental tests to be much more robust and accurate [16,17] than conventional energy-minimization approaches. The more challenging problem of de novo structure design is also being actively investigated [18,7,4,19].…”
Section: Introduction
confidence: 99%