An exhaustive statistical analysis of the amino acid sequences at the carboxyl (C) and amino (N) termini of proteins and of coding nucleic acid sequences at the 5' side of the stop codons was undertaken. At the N ends, Met and Ala residues are over-represented at the first (+1) position whereas at positions 2 and 5 Thr is preferred. These peculiarities at N-termini are most probably related to the mechanism of initiation of translation (for Met) and to the mechanisms governing the life-span of proteins via regulation of their degradation (for Ala and Thr). We assume that the C-terminal bias facilitates fixation of the C ends on the protein globule by a preference for charged and Cys residues. The terminal biases, a novel feature of protein structure, have to be taken into account when molecular evolution, three-dimensional structure, initiation and termination of translation, protein folding and life-span are concerned. In addition, the bias of protein termini composition is an important feature which should be considered in protein engineering experiments.
We have undertaken an exhaustive statistical analysis of the amino acid sequences at the carboxyl-terminal (C) ends of proteins. The composition of the C-terminal decapeptides differs from that expected for the given proteins from their overall amino acid composition. For E. coli, yeast, and H. sapiens, positively charged amino acid residues are over-represented while Gly residues are under-represented. The C-terminal bias, a novel feature of protein structure, should be taken into account in studies of molecular evolution, spatial structure, translational termination and protein folding.
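The comparison described above, terminal composition against whole-protein composition, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the function names, the ratio used as a bias measure, and the toy sequences are all hypothetical, and a real analysis would need a curated protein set and a significance test.

```python
from collections import Counter

def composition(seqs):
    """Fraction of each residue over all given sequences combined."""
    counts = Counter()
    for s in seqs:
        counts.update(s)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}

def c_terminal_bias(proteins, window=10):
    """Ratio of residue frequency in the last `window` residues to its
    frequency over the whole proteins (>1 means over-represented)."""
    overall = composition(proteins)
    tails = composition([p[-window:] for p in proteins])
    return {aa: tails.get(aa, 0.0) / overall[aa] for aa in overall}

# Toy sequences invented for illustration only (not real proteins).
toy = [
    "MA" + "G" * 10 + "KRKRKRKRKK",
    "MT" + "G" * 10 + "AAKKRRKKRR",
]
bias = c_terminal_bias(toy)
for aa in sorted(bias, key=bias.get, reverse=True):
    print(aa, round(bias[aa], 2))
```

A ratio above 1 for Lys or Arg and below 1 for Gly in the last ten residues would correspond to the kind of bias the abstract reports; the default window of 10 mirrors the decapeptides mentioned in the text.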
A model for prediction of α-helical regions in amino acid sequences has been tested on the mainly-α protein structure class. The modeling consists of constructing a continuous hypothetical α-helical conformation for the whole protein chain, and was performed using molecular mechanics tools. The positive prediction of α-helical and non-α-helical pentapeptide fragments of the proteins is 79%. The model considers only local interactions in the polypeptide chain, without the influence of the tertiary structure. It was shown that local interactions define the α-helical conformation for 85% of the native α-helical regions. The relative energy contributions to the energy of the model were analyzed, with the finding that the van der Waals component determines the formation of α-helices. Hydrogen bonds remain at constant energy regardless of whether an α-helix or a non-α-helix occurs in the native protein, and do not determine the location of helical regions. In contrast to existing methods, this approach additionally permits the prediction of side-chain conformations. The model suggests the correct values for ∼60% of all χ-angles of α-helical residues.
Keywords: Protein secondary structure; structure prediction; α-helix; side-chain conformation; χ-angles; molecular mechanics
Although the α-helical conformation of the polypeptide chain was successfully predicted by Pauling and coworkers as early as 1951 (Pauling et al. 1951), the role of the different forces that determine helix formation is still being studied. The α-helix was shown to be the most stable and energetically favorable configuration of the protein polypeptide chain. Pauling made this remarkable prediction on the basis of calculations of the optimal van der Waals interactions of the main-chain atoms with each other and with the side-chain atoms. At the same time, the principle of the maximal saturation of interpeptide hydrogen bonds was proposed.
In the α-helix in its canonical Pauling form, all hydrogen bonds were formed by intramolecular interactions. The interactions between side chains were not discussed at that time.
In native proteins, the order of amino acids in the sequence influences the presence of the α-helical conformation, as exemplified by the statistics of residue distribution along α-helices (Kumar and Bansal 1998). Nevertheless, secondary structure prediction methods that are based only on residue propensities derived from occurrence statistics have failed to exceed 65% accuracy for a single polypeptide chain (Barton 1995). As a consequence, prediction algorithms were developed that used evolutionary information and were based on homology alignments (Rost 2001). The significant role of side-chain interactions for α-helix formation was described recently and analyzed extensively, both theoretically and experimentally (Fisinger
These two authors contributed equally to this work. Article and publication are at http://www.proteinscience.org/cgi
We have developed a new type of microarray, the restriction site tagged (RST) microarray, exemplified here by NotI microarrays. In this approach only sequences surrounding specific restriction sites (i.e. NotI linking clones) were used for generating microarrays. DNA was labeled using a new procedure, NotI representation, in which only sequences surrounding NotI sites were labeled. Due to these modifications, the sensitivity of RST microarrays increases several hundred-fold compared to that of ordinary genomic microarrays. In a pilot experiment we have produced NotI microarrays from Gram-positive and Gram-negative bacteria and have shown that even closely related Escherichia coli strains can be easily discriminated using this technique. For example, two E. coli strains, K12 and R2, differ by less than 0.1% in their 16S rRNA sequences, and thus the 16S rRNA sequence would not easily discriminate between these strains. However, these strains showed distinctly different hybridization patterns with NotI microarrays. The same technique can be adapted to other restriction enzymes as well. This type of microarray opens the possibility not only for studies of the normal flora of the gut but also for any problem where quantitative and qualitative analysis of microbial (or large viral) genomes is needed.
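The idea of tagging a genome by the sequences around its NotI sites can be illustrated in silico. The sketch below is a hypothetical illustration, not the laboratory labeling procedure described above: it scans a sequence for the NotI recognition site (GCGGCCGC) and collects the flanking sequences. The function name, the flank length, and the toy sequences are assumptions made for the example.

```python
import re

NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence

def noti_tags(genome, flank=20):
    """Return (position, left_flank, right_flank) for every NotI site.

    A lookahead is used so that overlapping sites are also reported.
    """
    genome = genome.upper()
    tags = []
    for match in re.finditer(f"(?={NOTI_SITE})", genome):
        start = match.start()
        end = start + len(NOTI_SITE)
        left = genome[max(0, start - flank):start]
        right = genome[end:end + flank]
        tags.append((start, left, right))
    return tags

# Toy sequences invented for illustration: strain_b carries a single
# base change (C -> A) inside the site, which removes the NotI tag.
strain_a = "ATGCA" + "GCGGCCGC" + "TTTACG"
strain_b = "ATGCA" + "GCGGACGC" + "TTTACG"

print(noti_tags(strain_a, flank=5))
print(noti_tags(strain_b, flank=5))
```

In this toy case a single substitution removes the only site, so the two sequences yield different sets of tags even though they are otherwise identical, which gives an intuition for why closely related strains can show distinct NotI patterns.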