The superfamily of alpha-beta hydrolases is one of the largest groups of structurally related enzymes with diverse catalytic functions. Bioinformatic analysis was used to study how lipase and amidase catalytic activities are implemented within the same structural framework. Subfamily-specific positions (conserved within lipases and within peptidases but different between them), which were presumed to be responsible for functional discrimination, were identified. Mutations at subfamily-specific positions were used to introduce amidase activity into Candida antarctica lipase B (CALB). Molecular modeling was used to evaluate the influence of selected residues on the binding and catalytic conversion of an amide substrate by the corresponding library of mutants. In silico screening was applied to select reactive enzyme-substrate complexes that satisfy knowledge-based criteria of amidase catalytic activity. Selected CALB variants with substitutions at subfamily-specific positions Gly39, Thr103, Trp104, and Leu278 were produced and showed a significant improvement in experimentally measured amidase activity. Based on these results, we suggest that the value of subfamily-specific positions should be further explored in order to develop a systematic tool to study structure-function relationships in enzymes and to use this information for rational enzyme engineering.
Understanding the role of specific amino acid residues in the molecular mechanism of a protein's function is one of the most challenging problems in modern biology. A systematic bioinformatic analysis of protein families and superfamilies can help in the study of structure–function relationships and in the design of improved variants of enzymes/proteins, but represents a methodological challenge. The pyridoxal‐5′‐phosphate (PLP)‐dependent enzymes are catalytically diverse and include the aspartate aminotransferase superfamily, which implements a common structural framework known as fold type I. In this work, the recently developed bioinformatic online methods Mustguseal and Zebra were used to collect and study a large representative set of the aspartate aminotransferase superfamily with high structural, but low sequence similarity to L‐threonine aldolase from Aeromonas jandaei (LTAaj), in order to identify conserved positions that provide general properties in the superfamily, and to reveal family‐specific positions (FSPs) responsible for functional diversity. The roles of the identified residues in the catalytic mechanism and reaction specificity of LTAaj were then studied by experimental site‐directed mutagenesis and molecular modelling. It was shown that FSPs determine reaction specificity by coordinating the PLP cofactor in the enzyme's active centre, thus influencing its activation and the tautomeric equilibrium of the intermediates, and that they can be used as hotspots to modulate the protein's functional properties. Mutagenesis at the selected FSPs in LTAaj led to a reduction in the native catalytic activity and increased the rate of promiscuous reactions. The results provide insight into the structural basis of catalytic promiscuity of the PLP‐dependent enzymes and demonstrate the potential of bioinformatic analysis in studying structure–function relationships in protein superfamilies.
Protein stability facilitates the development of novel properties and can be crucial in affording tolerance to mutations that introduce functionally preferential phenotypes. Consequently, understanding the determining factors for protein stability is important for the study of structure-function relationships and the design of novel protein functions. Thermal stability has been extensively studied in connection with the practical application of biocatalysts. However, little work has been done to explore the mechanism of pH-dependent inactivation. In this study, bioinformatic analysis of the Ntn-hydrolase superfamily was performed to identify functionally important subfamily-specific positions in protein structures. Furthermore, the involvement of these positions in pH-induced inactivation was studied. The conformational mobility of penicillin acylase from Escherichia coli was analyzed through molecular modeling under neutral and alkaline conditions. Two functionally important subfamily-specific residues, Gluβ482 and Aspβ484, were found. Ionization of these residues at alkaline pH promoted the collapse of a buried network of stabilizing interactions, which consequently disrupted the functional protein conformation. The subfamily-specific position Aspβ484 was selected as a hotspot for mutation to engineer an enzyme variant tolerant to alkaline medium. The corresponding Dβ484N mutant was produced and showed a 9-fold increase in stability under alkaline conditions. Bioinformatic analysis of subfamily-specific positions can be further explored to study mechanisms of protein inactivation and to design more stable variants for the engineering of homologous Ntn-hydrolases with improved catalytic properties.
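The link between alkaline pH and ionization of an acidic side chain can be illustrated with the Henderson-Hasselbalch relation. The sketch below is only a back-of-the-envelope illustration, not the molecular modeling used in the study; the pKa value is a hypothetical placeholder for a buried carboxylate and is not taken from the abstract.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Fraction of an acidic group in its ionized (deprotonated) form.

    Henderson-Hasselbalch: [A-] / ([HA] + [A-]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


if __name__ == "__main__":
    # HYPOTHETICAL pKa for a buried Asp/Glu side chain (assumption, not from the source).
    assumed_pka = 8.0
    for ph in (7.0, 8.0, 9.0, 10.0):
        frac = fraction_deprotonated(ph, assumed_pka)
        print(f"pH {ph:4.1f}: ionized fraction = {frac:.3f}")
```

Under this assumption the ionized fraction rises steeply as the pH passes the pKa, which is consistent with the qualitative picture of a residue that stays protonated at neutral pH and becomes charged in alkaline medium.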
The ability of proteins and enzymes to maintain a functionally active conformation under adverse environmental conditions is an important feature of biocatalysts, vaccines, and biopharmaceutical proteins. From an evolutionary perspective, robust stability of proteins improves their biological fitness and allows for further optimization. Viewed from an industrial perspective, enzyme stability is crucial for the practical application of enzymes under the required reaction conditions. In this review, we analyze bioinformatic-driven strategies that are used to predict structural changes that can be applied to wild type proteins in order to produce more stable variants. The most commonly employed techniques can be classified into stochastic approaches, empirical or systematic rational design strategies, and design of chimeric proteins. We conclude that bioinformatic analysis can be efficiently used to study large protein superfamilies systematically as well as to predict particular structural changes which increase enzyme stability. Evolution has created a diversity of protein properties that are encoded in genomic sequences and structural data. Bioinformatics has the power to uncover this evolutionary code and provide a reproducible selection of hotspots - key residues to be mutated in order to produce more stable and functionally diverse proteins and enzymes. Further development of systematic bioinformatic procedures is needed to organize and analyze sequences and structures of proteins within large superfamilies and to link them to function, as well as to provide knowledge-based predictions for experimental evaluation.
Zebra2 is a highly automated web-tool to search for subfamily-specific and conserved positions (i.e. the determinants of functional diversity as well as the key catalytic and structural residues) in protein superfamilies. The bioinformatic analysis is facilitated by Mustguseal, a companion web-server that automatically collects and superimposes a large representative set of functionally diverse homologs with high structure similarity but low sequence identity to the selected query protein. The results are automatically prioritized and provided at four information levels to facilitate the knowledge-driven expert selection of the most promising positions on-line: as a sequence similarity network; interfaces to sequence-based and 3D-structure-based analysis of conservation and variability; and accompanied by the detailed annotation of proteins accumulated from the integrated databases with links to the external resources. The integration of the Zebra2 and Mustguseal web-tools provides the first of its kind out-of-the-box open-access solution to conduct a systematic analysis of evolutionarily related proteins implementing different functions within a shared 3D-structure of the superfamily, to determine common and specific patterns of function-associated local structural elements, and to help select hot-spots for rational design and prepare focused libraries for directed evolution. The web-servers are free and open to all users at https://biokinet.belozersky.msu.ru/zebra2, no login required.
Proteins within a single family usually share a common function but differ in more specific features and can be divided into subfamilies with different properties. The availability of genomic, structural, and functional information in numerous databases provides new opportunities for bioinformatic analysis of homologous proteins. In this work, a new method of bioinformatic analysis has been developed to identify subfamily-specific positions (SSPs), which are conserved only within protein subfamilies but differ between subfamilies and seem to play an important role in functional diversity. A novel scoring function is suggested that considers structural information as well as physicochemical and residue conservation in protein subfamilies. Random shuffling is performed to rank results by significance, and Bernoulli statistics is applied to calculate p-values. The algorithm does not require a predefined subfamily classification and can propose one automatically by graph-based clustering. This method can be used as a tool to explore SSPs with different structural localization in order to understand their implications for the structure-function relationship and protein function. A web interface to the program is available at http://biokinet.belozersky.msu.ru/zebra.
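The idea of scoring a position for subfamily specificity and ranking it against shuffled data can be sketched as follows. This is a minimal toy illustration, not the Zebra scoring function: the score ignores structural and physicochemical information, the subfamily labels are given rather than derived by clustering, and an empirical shuffling p-value is substituted for the Bernoulli statistics mentioned in the abstract. The names `column_score` and `shuffle_p_value` are hypothetical.

```python
import random
from collections import Counter


def column_score(column, labels):
    """Toy subfamily-specificity score for one alignment column.

    Mean within-subfamily conservation (frequency of the most common residue
    in each subfamily) minus the conservation of the column as a whole.
    High values mean 'conserved within subfamilies, different between them'.
    """
    groups = {}
    for residue, label in zip(column, labels):
        groups.setdefault(label, []).append(residue)
    within = sum(
        Counter(members).most_common(1)[0][1] / len(members)
        for members in groups.values()
    ) / len(groups)
    overall = Counter(column).most_common(1)[0][1] / len(column)
    return within - overall


def shuffle_p_value(column, labels, n_shuffles=1000, seed=0):
    """Empirical p-value: how often a randomly shuffled column scores at least
    as high as the observed one (with the usual +1 correction)."""
    rng = random.Random(seed)
    observed = column_score(column, labels)
    shuffled = list(column)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if column_score(shuffled, labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    # Made-up example: 8 aligned sequences, two subfamilies A and B.
    labels = "AAAABBBB"
    for name, column in [("specific", "GGGGSSSS"), ("conserved", "AAAAAAAA")]:
        score, p = shuffle_p_value(column, labels)
        print(f"{name}: score={score:.2f}, p={p:.3f}")
```

In this toy setup a fully conserved column scores zero, so it is not mistaken for a subfamily-specific one, while a column that separates the two subfamilies perfectly scores highest and is rarely matched by shuffled data.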
The visualCMAT web-server was designed to assist experimental research in the fields of protein/enzyme biochemistry, protein engineering, and drug discovery by providing an intuitive and easy-to-use interface to the analysis of correlated mutations/co-evolving residues. Sequence and structural information describing homologous proteins is used to predict correlated substitutions by the mutual information-based CMAT approach; to classify them into spatially close co-evolving pairs (which either form a direct physical contact or interact with the same ligand, e.g. a substrate or a crystallographic water molecule) and long-range correlations; and to annotate and rank binding sites on the protein surface by the presence of statistically significant co-evolving positions. The results of visualCMAT are organized for convenient visual analysis and can be downloaded to a local computer as a content-rich all-in-one PyMol session file with multiple layers of annotation corresponding to bioinformatic, statistical and structural analyses of the predicted co-evolution, or further studied online using the built-in interactive analysis tools. The online interactivity is implemented in HTML5 and therefore neither plugins nor Java are required. The visualCMAT web-server is integrated with the Mustguseal web-server, which is capable of constructing large structure-guided sequence alignments of protein families and superfamilies using all available information about their structures and sequences in public databases. The visualCMAT web-server can be used to understand the relationship between structure and function in proteins, to select hotspots and compensatory mutations for rational design and directed evolution experiments that produce novel enzymes with improved properties, and to study the mechanisms of selective ligand binding and allosteric communication between topologically independent sites in protein structures.
The web-server is freely available at https://biokinet.belozersky.msu.ru/visualcmat and there are no login requirements.
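The mutual information that underlies correlated-mutation analysis can be sketched for a single pair of alignment columns. This is a minimal illustration of the plain textbook quantity, not the CMAT implementation: it applies no statistical significance filtering, no gap handling, and no structural classification of the pairs. The function name and the example columns are hypothetical.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    MI = sum over residue pairs (a, b) of p(a, b) * log2(p(a, b) / (p(a) * p(b))),
    with probabilities estimated as observed frequencies across the sequences.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must come from the same set of sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        mi += p_ab * math.log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    return mi


if __name__ == "__main__":
    # Made-up columns: a charge swap (D<->K) that always co-occurs,
    # and a pair of columns that vary independently of each other.
    print(mutual_information("DDDDKKKK", "KKKKDDDD"))
    print(mutual_information("DDKKDDKK", "AGAGAGAG"))
```

Pairs with high mutual information are candidates for co-evolution; deciding whether they are spatially close or long-range requires mapping the columns onto a 3D structure, which this sketch does not attempt.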