2022
DOI: 10.1101/2022.06.25.497605
Preprint

Accurate and efficient protein sequence design through learning concise local environment of residues

Abstract: Computational protein sequence design has been widely applied in rational protein engineering, and increasing its accuracy and efficiency is highly desired. Here we present an accurate and efficient design approach called ProDESIGN-LE, which uses a concise but informative representation of a residue’s local environment and trains a transformer to predict a residue’s amino acid type from its local environment. Using the trained transformer, ProDESIGN-LE iteratively selects an appropriate residue at a random …
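
The iterative procedure outlined in the abstract can be illustrated with a minimal sketch. The abstract is truncated here, so several details are assumptions rather than the authors' method: that the random choice refers to a sequence position, that the sequence starts from random residues, and that the most probable residue type is taken at each step. `toy_predictor` and `iterative_design` are hypothetical names, and the predictor is a stand-in for the trained transformer, not the real model.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residue types


def toy_predictor(sequence, position):
    """Hypothetical stand-in for the trained transformer.

    Returns a probability for each amino acid type at `position`.
    This toy version simply favours alanine; a real model would
    condition on the residue's structural local environment.
    """
    probs = {aa: 0.5 / 19 for aa in AMINO_ACIDS}
    probs["A"] = 0.5
    return probs


def iterative_design(length, predictor, n_steps, seed=0):
    """Repeatedly pick a random position and update its residue."""
    rng = random.Random(seed)
    # Assumption: start from a random sequence.
    sequence = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    for _ in range(n_steps):
        # Assumption: "random" in the abstract refers to the position.
        position = rng.randrange(length)
        probs = predictor(sequence, position)
        # Assumption: take the most probable residue type.
        sequence[position] = max(probs, key=probs.get)
    return "".join(sequence)


designed = iterative_design(length=12, predictor=toy_predictor, n_steps=200)
print(designed)
```

Passing the predictor as an argument keeps the loop independent of any particular model, so a real structure-conditioned network could be swapped in without changing the update logic.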

citations
Cited by 7 publications
(10 citation statements)
references
References 37 publications
(45 reference statements)
“…5A. Alanine has the highest frequency in both categories, and this is consistent with the BLOSUM62 matrix, indicating A has a high substitution probability with C. Interestingly, mutations that are more likely to be associated with improved enzyme thermostability involve conservative replacements, such as S and T, which may have less influence on local polarity and volume [73]. Typically, considerable proteins possess unbounded terminus and the stabilization of these unbounded short regions can greatly improve the overall thermostability of the enzyme.…”
Section: Non-covalent Interactions Around Mutation Sites
Type: supporting
confidence: 71%
“…Additionally, improvements were observed in the metrics of exposed hydrophobics. We further compared GeoSeqBuilder with representative deep learning sequence design approaches, including graph based ProteinMPNN [28], 3D-CNN based method [22], and transformer-based ABACUS-R [27] and ProDESIGN-LE [50]. The average pLDDT and TM-score (78.…”
Section: Model Evaluation On Natural and De Novo Designed Structures ...
Type: mentioning
confidence: 99%
“…MIF [70] adapted Ingraham’s [35] architecture to a bidirectional denoising model. ProDESIGN-LE [71] inputs structural local environments to three encoder layers to output a distribution over the 20 residue types. Graph Vector Perceptrons (GVP-GNN) [72] which replaced multilayer perceptrons (MLPs) in GNNs improved performance in protein design and model quality assessment.…”
Section: The Deep Learning Era Of Protein Sequence and Structure Gene...
Type: mentioning
confidence: 99%
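
The last statement describes the model as outputting a distribution over the 20 residue types. As a rough illustration of that final step only, the sketch below turns 20 raw scores into a probability distribution with a softmax. This is an assumption about how such an output is typically normalised, not a description of the ProDESIGN-LE code; `softmax` and `residue_distribution` are hypothetical helpers, and the example logits are made up.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residue types


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def residue_distribution(logits):
    """Map 20 raw scores to a probability per amino acid type."""
    if len(logits) != len(AMINO_ACIDS):
        raise ValueError("expected one logit per residue type")
    return dict(zip(AMINO_ACIDS, softmax(logits)))


# Made-up scores: every type scores 0.0 except leucine.
logits = [0.0] * len(AMINO_ACIDS)
logits[AMINO_ACIDS.index("L")] = 2.0

dist = residue_distribution(logits)
best = max(dist, key=dist.get)
print(best, round(dist[best], 3))
```

A design loop can either take the most probable type from such a distribution or sample from it; which of the two the method uses is not stated in the text above.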