One of the main challenges for protein redesign is the efficient evaluation of a combinatorial number of candidate structures. The modeling of protein flexibility, typically by using a rotamer library of commonly-observed low-energy side-chain conformations, further increases the complexity of the redesign problem. A dominant algorithm for protein redesign is Dead-End Elimination (DEE), which prunes the majority of candidate conformations by eliminating rigid rotamers that provably are not part of the Global Minimum Energy Conformation (GMEC). The identified GMEC consists of rigid rotamers (i.e., rotamers that have not been energy-minimized) and is thus referred to as the rigid-GMEC. As a post-processing step, the conformations that survive DEE may be energy-minimized. When energy minimization is performed after pruning with DEE, the combined protein design process becomes heuristic, and is no longer provably accurate: a conformation that is pruned using rigid-rotamer energies may subsequently minimize to a lower energy than the rigid-GMEC. That is, the rigid-GMEC and the conformation with the lowest energy among all energy-minimized conformations (the minimized-GMEC) are likely to be different. While the traditional DEE algorithm succeeds in not pruning rotamers that are part of the rigid-GMEC, it makes no guarantees regarding the identification of the minimized-GMEC. In this paper we derive a novel, provable, and efficient DEE-like algorithm, called minimized-DEE (MinDEE), that guarantees that rotamers belonging to the minimized-GMEC will not be pruned, while still pruning a combinatorial number of conformations. We show that MinDEE is useful not only in identifying the minimized-GMEC, but also as a filter in an ensemble-based scoring and search algorithm for protein redesign that exploits energy-minimized conformations. 
We compare our results both to our previous computational predictions of protein designs and to biological activity assays of predicted protein mutants. Our provable and efficient minimized-DEE algorithm is applicable in protein redesign, protein-ligand binding prediction, and computer-aided drug design.
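The rigid-rotamer pruning step described above can be illustrated with a minimal sketch. It uses the classical Goldstein form of the DEE criterion, not MinDEE, whose criterion is not stated in this abstract. The energy tables `self_e` and `pair_e` and the function names are hypothetical, chosen only for illustration.

```python
def pair_energy(pair_e, i, r, j, s):
    """Pairwise energy between rotamer r at position i and rotamer s at position j.

    Assumption: pair_e stores each position pair once, keyed as (low, high).
    """
    return pair_e[(i, j)][r][s] if i < j else pair_e[(j, i)][s][r]


def dee_prune(self_e, pair_e):
    """Rigid-rotamer DEE pruning (Goldstein criterion), repeated to convergence.

    self_e[i][r] is the self energy of rotamer r at position i.
    Rotamer r at position i is pruned if some competitor t at i satisfies
        E(i_r) - E(i_t) + sum_j min_s [E(i_r, j_s) - E(i_t, j_s)] > 0,
    which means r cannot be part of the rigid-GMEC.
    Returns one set of surviving rotamer indices per position.
    """
    n = len(self_e)
    alive = [set(range(len(self_e[i]))) for i in range(n)]
    changed = True
    while changed:
        changed = False
        for i in range(n):
            for r in list(alive[i]):
                for t in list(alive[i]):
                    if t == r:
                        continue
                    gap = self_e[i][r] - self_e[i][t]
                    for j in range(n):
                        if j == i:
                            continue
                        # Best case for r relative to t over surviving rotamers at j.
                        gap += min(
                            pair_energy(pair_e, i, r, j, s)
                            - pair_energy(pair_e, i, t, j, s)
                            for s in alive[j]
                        )
                    if gap > 0:
                        alive[i].discard(r)
                        changed = True
                        break
    return alive


# Toy example: two positions with two rotamers each (made-up energies).
self_e = [[0.0, 1.0], [0.0, 0.0]]
pair_e = {(0, 1): [[0.0, 10.0], [-5.0, 0.0]]}
survivors = dee_prune(self_e, pair_e)
```

In this toy case the lowest total energy belongs to rotamer 1 at position 0 paired with rotamer 0 at position 1, and those are the only rotamers that survive pruning. Because the comparison uses rigid energies, a guarantee of this kind says nothing about which conformation is lowest after minimization, which is the gap the abstract identifies.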
Realization of novel molecular function requires the ability to alter molecular complex formation. Enzymatic function can be altered by changing enzyme-substrate interactions via modification of an enzyme's active site. A redesigned enzyme may either perform a novel reaction on its native substrates or its native reaction on novel substrates. A number of computational approaches have been developed to address the combinatorial nature of the protein redesign problem. These approaches typically search for the global minimum energy conformation among an exponential number of protein conformations. We present a novel algorithm for protein redesign, which combines a statistical mechanics-derived ensemble-based approach to computing the binding constant with the speed and completeness of a branch-and-bound pruning algorithm. In addition, we developed an efficient deterministic approximation algorithm, capable of approximating our scoring function to arbitrary precision. In practice, the approximation algorithm decreases the execution time of the mutation search by a factor of ten. To test our method, we examined the Phe-specific adenylation domain of the nonribosomal peptide synthetase gramicidin synthetase A (GrsA-PheA). Ensemble scoring, using a rotameric approximation to the partition functions of the bound and unbound states for GrsA-PheA, is first used to predict binding of the wildtype protein and a previously described mutant (selective for leucine), and second, to switch the enzyme specificity toward leucine, using two novel active site sequences computationally predicted by searching through the space of possible active site mutations. The top scoring in silico mutants were created in the wetlab and dissociation/binding constants were determined by fluorescence quenching. These tested mutations exhibit the desired change in specificity from Phe to Leu. 
Our ensemble-based algorithm, which flexibly models both protein and ligand using rotamer-based partition functions, has application in enzyme redesign, the prediction of protein-ligand binding, and computer-aided drug design.
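As a rough illustration of ensemble-based scoring with rotamer-based partition functions, the sketch below sums Boltzmann weights over a list of conformation energies and takes the ratio of the bound-state partition function to the product of the unbound ones. The exact form of the score, the function names, the energy units (kcal/mol), and the default temperature are assumptions made here, not details given in the abstract.

```python
import math

R_KCAL = 1.9872e-3  # gas constant in kcal/(mol*K)


def partition_function(energies, temperature=298.15):
    """Sum of Boltzmann weights over a discrete ensemble of conformations."""
    rt = R_KCAL * temperature
    return sum(math.exp(-e / rt) for e in energies)


def ensemble_score(bound, free_protein, free_ligand, temperature=298.15):
    """Ratio q_bound / (q_protein * q_ligand), used as a binding-constant estimate.

    Each argument is a list of conformation energies for one state
    (hypothetical inputs; assumed to be in kcal/mol).
    """
    q_bound = partition_function(bound, temperature)
    q_protein = partition_function(free_protein, temperature)
    q_ligand = partition_function(free_ligand, temperature)
    return q_bound / (q_protein * q_ligand)


# Made-up energies: two bound conformations, one conformation per unbound state.
score = ensemble_score([0.0, 0.0], [0.0], [0.0])
```

A higher score corresponds to a bound ensemble with more low-energy conformations relative to the unbound states, so candidate mutants could be ranked by this value.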
The two subunits of core binding factor (Runx1 and CBFbeta) play critical roles in hematopoiesis and are frequent targets of chromosomal translocations found in leukemia. The binding of the CBFbeta-smooth muscle myosin heavy chain (SMMHC) fusion protein to Runx1 is essential for leukemogenesis, making this a viable target for treatment. We have developed inhibitors with low micromolar affinity which effectively block binding of Runx1 to CBFbeta. NMR-based docking shows that these compounds bind to CBFbeta at a site displaced from the binding interface for Runx1, that is, these compounds function as allosteric inhibitors of this protein-protein interaction, a potentially generalizable approach. Treatment of the human leukemia cell line ME-1 with these compounds shows decreased proliferation, indicating these are good candidates for further development.
We have determined the crystal structure of dihydrofolate reductase-thymidylate synthase (DHFR-TS) from Cryptosporidium hominis, revealing a unique linker domain containing an 11-residue α-helix that has extensive interactions with the opposite DHFR-TS monomer of the homodimeric enzyme. Analysis of the structure of DHFR-TS from C. hominis and of previously solved structures of DHFR-TS from Plasmodium falciparum and Leishmania major reveals that the linker domain primarily controls the relative orientation of the DHFR and TS domains. Using the tertiary structure of the linker domains, we have been able to place a number of protozoa in two distinct and dissimilar structural families corresponding to two evolutionary families and provide the first structural evidence validating the use of DHFR-TS as a tool of phylogenetic classification. Furthermore, the structure of C. hominis DHFR-TS calls into question surface electrostatic channeling as the universal means of dihydrofolate transport between TS and DHFR in the bifunctional enzyme. Thymidylate synthase (TS) and dihydrofolate reductase (DHFR) are essential enzymes in the cell cycle of all organisms, since they catalyze the production of dTMP, required for DNA replication. TS converts the substrate, dUMP, to dTMP by reductive methylation using the cofactor, 5,10-methylene tetrahydrofolate, and releases dihydrofolate (1). In the presence of the cofactor NADPH, DHFR reduces dihydrofolate to tetrahydrofolate. The folate cycle is completed by serine hydroxymethyl transferase, which converts tetrahydrofolate back to 5,10-methylene tetrahydrofolate. Recently, Stechmann and Cavalier-Smith (2) have addressed the problem of locating the root of the eukaryotic tree, one of the most challenging evolutionary problems.
In several protozoa, including Alveolates and Euglenozoa, and in some plants, the genes for DHFR and TS are translated as a single polypeptide, forming a bifunctional enzyme (DHFR-TS), whereas in most animals, fungi, and bacteria, these two enzymes are monofunctional. The monofunctional form of DHFR is a monomer, and that of TS is a dimer. The currently held hypothesis is that the primordial form of DHFR and TS is the monofunctional form and that the genes for DHFR and TS became fused at a single evolutionary point. If the DHFR-TS gene fusion occurred just once, then the fused gene provides an excellent phylogenetic marker, since reversing the fusion would require multiple genetic events. Stechmann and Cavalier-Smith have used the derived gene fusion between DHFR and TS to place the root of the tree below the common ancestor of plants, Alveolates, and Euglenozoa (Fig.
We have developed an algorithm called Q5 for probabilistic classification of healthy versus disease whole serum samples using mass spectrometry. The algorithm employs principal components analysis (PCA) followed by linear discriminant analysis (LDA) on whole spectrum surface-enhanced laser desorption/ionization time of flight (SELDI-TOF) mass spectrometry (MS) data and is demonstrated on four real datasets from complete, complex SELDI spectra of human blood serum. Q5 is a closed-form, exact solution to the problem of classification of complete mass spectra of a complex protein mixture. Q5 employs a probabilistic classification algorithm built upon a dimension-reduced linear discriminant analysis. Our solution is computationally efficient; it is noniterative and computes the optimal linear discriminant using closed-form equations. The optimal discriminant is computed and verified for datasets of complete, complex SELDI spectra of human blood serum. Replicate experiments of different training/testing splits of each dataset are employed to verify robustness of the algorithm. The probabilistic classification method achieves excellent performance. We achieve sensitivity, specificity, and positive predictive values above 97% on three ovarian cancer datasets and one prostate cancer dataset. The Q5 method outperforms previous full-spectrum complex sample spectral classification techniques and can provide clues as to the molecular identities of differentially expressed proteins and peptides.
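The PCA-then-LDA pipeline described above can be sketched generically as follows. This is not the Q5 implementation: it is a minimal two-class example assuming NumPy is available, it returns hard class labels rather than probabilities, and the function names and the toy data are hypothetical.

```python
import numpy as np


def fit_pca_lda(X, y, n_components):
    """Fit PCA for dimension reduction, then a two-class Fisher discriminant.

    X: array of shape (n_samples, n_features), e.g. one spectrum per row.
    y: array of 0/1 class labels.
    Both steps are closed form: an SVD for PCA and one linear solve for LDA.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    W = vt[:n_components].T
    Z = Xc @ W
    m0 = Z[y == 0].mean(axis=0)
    m1 = Z[y == 1].mean(axis=0)
    # Within-class scatter in the reduced space.
    Sw = (Z[y == 0] - m0).T @ (Z[y == 0] - m0) + (Z[y == 1] - m1).T @ (Z[y == 1] - m1)
    w = np.linalg.solve(Sw, m1 - m0)
    # Decision threshold at the midpoint of the projected class means.
    threshold = w @ (m0 + m1) / 2.0
    return mean, W, w, threshold


def predict(model, X):
    """Project samples onto the discriminant and threshold into class 0 or 1."""
    mean, W, w, threshold = model
    return ((X - mean) @ W @ w > threshold).astype(int)


# Toy data standing in for spectra: two well-separated classes in two dimensions.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1])
model = fit_pca_lda(X, y, n_components=1)
labels = predict(model, X)
```

Reducing the dimension first keeps the within-class scatter matrix small and invertible when the number of features far exceeds the number of samples, which is the usual situation for full spectra.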