O-GalNAc-glycosylation is one of the main types of glycosylation in mammalian cells. No consensus recognition sequence for the O-glycosyltransferases is known, making prediction methods necessary to bridge the gap between the large number of known protein sequences and the small number of proteins experimentally investigated with regard to glycosylation status. From O-GLYCBASE, a total of 86 mammalian proteins experimentally investigated for in vivo O-GalNAc sites were extracted. Mammalian protein homolog comparisons showed that a glycosylated serine or threonine is less likely to be precisely conserved than a nonglycosylated one. The Protein Data Bank was analyzed for structural information, and 12 glycosylated structures were obtained. All positive sites were found in coil or turn regions. A method for predicting the location of mucin-type glycosylation sites was trained using a neural network approach. The best overall network used as input amino acid composition and averaged surface accessibility predictions, together with a substitution matrix profile encoding of the sequence. To improve prediction on isolated (single) sites, networks were trained on isolated sites only. The final method combines predictions from the best overall network and the best isolated site network; this prediction method correctly predicted 76% of the glycosylated residues and 93% of the nonglycosylated residues. NetOGlyc 3.1 can predict sites for completely new proteins without loss of performance. That the sites could be predicted from averaged properties, together with the fact that glycosylation sites are not precisely conserved, indicates that mucin-type glycosylation is in most cases a bulk property rather than a highly site-specific one. NetOGlyc 3.1 is made available at www.cbs.dtu.dk/services/netoglyc.
C-mannosylation is the attachment of an alpha-mannopyranose to a tryptophan via a C-C linkage. The sequence WXXW, in which the first Trp becomes mannosylated, has been suggested as a consensus motif for the modification, but only two-thirds of known sites follow this rule. We gathered a data set of 69 experimentally verified C-mannosylation sites from the literature. We analyzed these for sequence context and found that, apart from Trp, Cys is also accepted in position +3. We also found a clear preference in position +1, where a small and/or polar residue (Ser, Ala, Gly, or Thr) is preferred and Phe or Leu is discriminated against. The Protein Data Bank was searched for structural information, and five structures of C-mannosylated proteins were obtained. We showed that modified tryptophan residues are at least partly solvent exposed. A method predicting the location of C-mannosylation sites in proteins was developed using a neural network approach. The best overall network used a 21-residue sequence input window and information on the presence/absence of the WXXW motif. NetCGlyc 1.0 correctly predicts 93% of both positive and negative C-mannosylation sites. This is a significant improvement over the WXXW consensus motif itself, which identifies only 67% of positive sites. NetCGlyc 1.0 is available at http://www.cbs.dtu.dk/services/NetCGlyc/. Using NetCGlyc 1.0, we scanned the human genome and found 2573 exported or transmembrane transcripts with at least one predicted C-mannosylation site.
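The sequence rules described in this abstract (Trp at position 0, Trp or Cys at position +3, and a preference for Ser, Ala, Gly, or Thr over Phe or Leu at position +1) can be illustrated with a simple motif scan. The Python sketch below is only an illustration of those rules and is not the NetCGlyc neural network; the function name `candidate_sites`, the labels, and the example sequence are hypothetical.

```python
import re

# Zero-width lookahead so that overlapping motifs (e.g. WSAWSAW) are all found.
# Position 0 is Trp; position +3 is Trp or Cys, as reported in the abstract.
MOTIF = re.compile(r"(?=W..[WC])")

FAVOURED_PLUS1 = set("SAGT")    # small and/or polar residues preferred at +1
DISFAVOURED_PLUS1 = set("FL")   # residues discriminated against at +1


def candidate_sites(sequence):
    """Return (1-based Trp position, 4-residue motif, +1 label) for each hit.

    This is a plain motif scan, so it will miss sites that do not follow the
    W-x-x-[WC] pattern and will report motifs that are not modified in vivo.
    """
    seq = sequence.upper()
    hits = []
    for match in MOTIF.finditer(seq):
        i = match.start()
        plus1 = seq[i + 1]
        if plus1 in FAVOURED_PLUS1:
            label = "favoured"
        elif plus1 in DISFAVOURED_PLUS1:
            label = "disfavoured"
        else:
            label = "neutral"
        hits.append((i + 1, seq[i:i + 4], label))
    return hits


if __name__ == "__main__":
    # Made-up sequence for illustration only, not a real protein.
    for position, motif, label in candidate_sites("AAWSAWSPCAAWFKW"):
        print(position, motif, label)
```

A scan like this only flags candidates; the abstract reports that the WXXW motif alone identifies 67% of positive sites, which is why a trained predictor was developed.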
Some proteins are highly conserved across all species, whereas others diverge significantly even between closely related species. Attempts have been made to correlate the rate of protein evolution to amino acid composition, protein dispensability, and the number of protein-protein interactions, but in all cases, conflicting studies have shown that the theories are hard to confirm experimentally. The only correlation that is undisputed so far is that highly/broadly expressed proteins seem to evolve at a lower rate. Consequently, it has been suggested that correlations between evolution rate and factors like protein dispensability or the number of protein-protein interactions could be just secondary effects due to differences in expression. The purpose of this study was to analyze mammalian proteins/genes with known subcellular location for variations in evolution rates. We show that proteins that are exported (extracellular proteins) evolve faster than proteins that reside inside the cell (intracellular proteins). We find weak, but significant, correlations between evolution rates and expression levels, percentage of tissues in which the proteins are expressed (expression broadness), and the number of protein interaction partners. More importantly, we show that the observed difference in evolution rate between extra- and intracellular proteins is largely independent of expression levels, expression broadness, and the number of protein-protein interactions. We also find that the difference is not caused by an overrepresentation of immunological proteins or disulfide bridge-containing proteins in the extracellular data set. We conclude that the subcellular location of a mammalian protein has a larger effect on its evolution rate than any of the other factors studied in this paper, including expression levels/patterns. We observe a difference in evolution rates between extracellular and intracellular proteins for a yeast data set as well and again show that it is completely independent of expression levels.
Interactions that stabilize the native state of a protein have been studied by measuring the affinity between subdomain fragments with and without site-specific residue substitutions. A calbindin D9k variant with a single CNBr cleavage site at position 43 between its two EF-hand subdomains was used as a starting point for the study. Into this variant were introduced 11 site-specific substitutions involving hydrophobic core residues at the interface between the two EF-hands. The mutants were cleaved with CNBr to produce wild-type and mutated single-EF-hand fragments: EF1 (residues 1-43) and EF2 (residues 44-75). The interaction between the two EF-hands was studied using surface plasmon resonance (SPR) technology, which follows the rates of association and dissociation of the complex. Wild-type EF1 was immobilized on a dextran matrix, and the wild-type and mutated versions of EF2 were injected at several different concentrations. In another set of experiments, wild-type EF2 was immobilized and wild-type or mutant EF1 was injected. Dissociation rate constants ranged between 1.1 × 10⁻⁵ and 1.0 × 10⁻² s⁻¹ and the association rate constants between 2 × 10⁵ and 4.0 × 10⁶ M⁻¹ s⁻¹. The affinity between EF1 and EF2 was as high as 3.6 × 10¹¹ M⁻¹ when none of them was mutated. For the 11 hydrophobic core mutants, a strong correlation (r = 0.999) was found between the affinity of EF1 for EF2 and the stability toward denaturation of the corresponding intact protein. The observed correlation implies that the factors governing the stability of the intact protein also contribute to the affinity of the bimolecular EF1-EF2 complex. In addition, the data presented here show that interactions among hydrophobic core residues are major contributors both to the affinity between the two EF-hand subdomains and to the stability of the intact domain.
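The reported affinity can be related to the SPR rate constants through the standard relation for a simple 1:1 binding model, K_A = k_on / k_off. The Python sketch below works through that arithmetic. It assumes, which the abstract does not state explicitly, that the highest association rate constant and the lowest dissociation rate constant both belong to the unmutated EF1-EF2 pair; the function names are hypothetical.

```python
def association_constant(k_on, k_off):
    """Equilibrium association constant K_A (M^-1) for a 1:1 binding model.

    k_on  : association rate constant in M^-1 s^-1
    k_off : dissociation rate constant in s^-1
    """
    return k_on / k_off


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant K_D (M), the reciprocal of K_A."""
    return k_off / k_on


if __name__ == "__main__":
    # Assumption: these extreme values from the reported ranges are taken
    # to describe the wild-type complex.
    k_on = 4.0e6    # M^-1 s^-1, upper end of the association range
    k_off = 1.1e-5  # s^-1, lower end of the dissociation range

    K_A = association_constant(k_on, k_off)
    K_D = dissociation_constant(k_on, k_off)
    print(f"K_A = {K_A:.1e} M^-1")  # K_A = 3.6e+11 M^-1
    print(f"K_D = {K_D:.2e} M")
```

Under that assumption, the ratio reproduces the 3.6 × 10¹¹ M⁻¹ affinity quoted for the unmutated fragments.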