An analysis of the structurally and catalytically diverse serine hydrolase protein family in the Saccharomyces cerevisiae proteome was undertaken using two independent but complementary large-scale approaches. The first approach is based on computational analysis of serine hydrolase active site structures; the second utilizes the chemical reactivity of the serine hydrolase active site in complex mixtures. These proteomics approaches share the ability to fractionate the complex proteome into functional subsets. Each method identified a significant number of sequences, but 15 proteins were identified by both methods. Eight of these were unannotated in the Saccharomyces Genome Database at the time of this study and are thus novel serine hydrolase identifications. Three of the previously uncharacterized proteins are members of a eukaryotic serine hydrolase family, designated Fsh (family of serine hydrolases), identified here for the first time. OVCA2, a potential human tumor suppressor, and DYR_SCHPO, a dihydrofolate reductase from Schizosaccharomyces pombe, are members of this family. Comparing the combined results to those of other proteomic methods showed that only four of the 15 proteins were identified in a recent large-scale "shotgun" proteomic analysis and eight were identified using a related approach (neither identifies function). Only 10 of the 15 were annotated using alternate motif-based computational tools. The results demonstrate the precision derived from combining complementary, function-based approaches to extract biological information from complex proteomes. The chemical proteomics technology indicates that a functional protein is being expressed in the cell, while the computational proteomics technology adds details about the specific type of function and the residue that is likely being labeled.
The combination of synergistic methods facilitates analysis, enriches true positive results, and increases confidence in novel identifications. This work also highlights the risks inherent in annotation transfer and the use of scoring functions for determination of correct annotations.
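The consensus step described above, keeping only the proteins reported by both the computational and the chemical approach, amounts to intersecting two hit lists. A minimal sketch in Python follows; the function name `consensus_hits` and the identifiers `ORF_A` through `ORF_E` are hypothetical placeholders, not proteins or tools from the study.

```python
def consensus_hits(computational_hits, chemical_hits):
    """Return identifiers reported by both methods, sorted for stable output."""
    return sorted(set(computational_hits) & set(chemical_hits))


# Hypothetical placeholder identifiers (not results from the study).
computational = ["ORF_A", "ORF_B", "ORF_C", "ORF_D"]
chemical = ["ORF_C", "ORF_D", "ORF_E"]

both = consensus_hits(computational, chemical)
only_computational = sorted(set(computational) - set(chemical))
only_chemical = sorted(set(chemical) - set(computational))

print("identified by both methods:", both)
print("computational only:", only_computational)
print("chemical only:", only_chemical)
```

Hits found by only one method are kept in separate lists rather than discarded, since the abstract notes that each method on its own also identified a significant number of sequences.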
A function annotation method using the sequence-to-structure-to-function paradigm is applied to the identification of all disulfide oxidoreductases in the Saccharomyces cerevisiae genome. The method identifies 27 sequences as potential disulfide oxidoreductases. All previously known thioredoxins, glutaredoxins, and disulfide isomerases are correctly identified. Three of the 27 predictions are probable false positives. Three novel predictions, which subsequently have been experimentally validated, are presented. Two additional novel predictions suggest a disulfide oxidoreductase regulatory mechanism for two subunits (OST3 and OST6) of the yeast oligosaccharyltransferase complex. Based on homology, this prediction can be extended to a potential tumor suppressor gene, N33, in humans, whose biochemical function was not previously known. Attempts to obtain a folded, active N33 construct to test the prediction were unsuccessful. The results show that structure prediction coupled with biochemically relevant structural motifs is a powerful method for the function annotation of genome sequences and can provide more detailed, robust predictions than function prediction methods that rely on sequence comparison alone.
To circumvent the limitations of sequence-based methods in making functional predictions for proteins, we have developed a methodology that uses a sequence-to-structure-to-function paradigm. First, an approximate three-dimensional structure is predicted. Then, a three-dimensional descriptor of the functional site, termed a Fuzzy Functional Form, or FFF, is used to screen the structure for the presence of the functional site of interest (Fetrow et al., 1998; Fetrow and Skolnick, 1998). Previously, a disulfide oxidoreductase FFF was developed and applied to predicted structures obtained from a small structural database. Here, using a substantially larger structural database, we expand the analysis of the disulfide oxidoreductase FFF to the B. subtilis genome. To ascertain the performance of the FFF, its results are compared to those obtained using both the sequence alignment method BLAST and three local sequence motif databases: PRINTS, Prosite, and Blocks. The FFF method is then compared in detail to Blocks, and it is shown that the FFF is more flexible and sensitive in finding a specific function in a set of unknown proteins. In addition, the estimated false positive rate of function prediction is significantly lower using the FFF structural motif than using the standard sequence motif methods. We also present a second FFF and describe a specific example of the results of its whole-genome application to D. melanogaster using a newer threading algorithm. Our results from all of these studies indicate that the addition of three-dimensional structural information adds significant value in the prediction of biochemical function of genomic sequences.
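The screening step described above can be illustrated with a simplified geometric motif search. The sketch below is not the published FFF: it assumes a motif defined only by residue types and pairwise Cα-Cα distances with tolerances, and every name (`Residue`, `DistanceConstraint`, `FuzzyMotif`, `find_matches`), coordinate, and distance value is hypothetical and chosen for illustration.

```python
import math
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Residue:
    index: int   # position in the sequence
    name: str    # three-letter residue code
    ca: tuple    # (x, y, z) alpha-carbon coordinates in angstroms


@dataclass(frozen=True)
class DistanceConstraint:
    i: int            # motif position of the first residue
    j: int            # motif position of the second residue
    target: float     # expected CA-CA distance in angstroms
    tolerance: float  # allowed deviation, the "fuzzy" part


@dataclass(frozen=True)
class FuzzyMotif:
    residue_types: tuple  # required residue type at each motif position
    constraints: tuple    # DistanceConstraint objects


def find_matches(structure, motif):
    """Return tuples of residue indices that satisfy every motif constraint."""
    candidates = [[r for r in structure if r.name == t] for t in motif.residue_types]
    matches = []
    for combo in product(*candidates):
        # A residue may fill only one motif position.
        if len({r.index for r in combo}) < len(combo):
            continue
        if all(
            abs(math.dist(combo[c.i].ca, combo[c.j].ca) - c.target) <= c.tolerance
            for c in motif.constraints
        ):
            matches.append(tuple(r.index for r in combo))
    return matches


# Hypothetical motif and toy structure; the values are illustrative only.
motif = FuzzyMotif(
    residue_types=("CYS", "CYS", "PRO"),
    constraints=(
        DistanceConstraint(0, 1, target=5.5, tolerance=1.0),
        DistanceConstraint(0, 2, target=8.0, tolerance=1.5),
        DistanceConstraint(1, 2, target=9.7, tolerance=1.0),
    ),
)
structure = [
    Residue(10, "CYS", (0.0, 0.0, 0.0)),
    Residue(13, "CYS", (5.5, 0.0, 0.0)),
    Residue(20, "ALA", (3.0, 3.0, 3.0)),
    Residue(50, "PRO", (0.0, 8.0, 0.0)),
]
matches = find_matches(structure, motif)
print(matches)
```

Because the constraints use tolerances rather than exact coordinates, a search of this kind can run against approximate predicted structures as well as experimental ones, which is the property the abstract relies on. The exhaustive search over candidate combinations is adequate for a three-residue motif but would need pruning for larger ones.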