The huge variation in the genomic guanine plus cytosine content (GC%) among prokaryotes has been explained by two mutually exclusive hypotheses, namely, selectionist and neutralist. The selectionist proposals have in common the assumption that this feature is a form of adaptation to some ecological or physiological condition. On the other hand, the neutralist interpretation states that the variations are due only to different mutational biases. Since all of the traits that have been proposed by the selectionists either appeared to be limited to certain genera or were invalidated by the availability of more data, they cannot be considered as a selective force influencing the genomic GC% across all prokaryotes. In this report we show that aerobic prokaryotes have a significantly higher genomic GC% than anaerobic ones. This is the first time that a link between a metabolic character and GC% has been found, independently of phylogenetic relationships and with a statistically significant amount of data.
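To make the compared quantity concrete, the following is a minimal sketch of how genomic GC% can be computed from a nucleotide sequence. The function name `gc_percent` and the example sequence are illustrative assumptions, not taken from the report; ambiguous bases such as N are simply left out of the denominator here.

```python
def gc_percent(sequence: str) -> float:
    """Return GC% computed over the unambiguous bases (A, C, G, T) only."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Hypothetical toy sequence; a real analysis would read a whole genome.
print(gc_percent("ATGCGCGNNTA"))
```

Comparing aerobic and anaerobic species would then amount to computing this value per genome and testing the difference between the two groups.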
In prokaryotes, GC levels range from 25% to 75%, and the optimal growth temperature (Topt) from approximately 0°C to >100°C. When all species are considered together, no correlation is found between the two variables. Correlations are found, however, when Families of prokaryotes are analysed. Indeed, when Families comprising at least 10 species were studied (a set of 20 Families), positive correlations were found for 15 of them. Furthermore, a comparative analysis by independent contrasts, made within the Families in order to control for phylogenetic non-independence, showed qualitatively equivalent results. We conclude that Topt is one of the factors that influence genomic GC in prokaryotes.
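The within-Family analysis can be sketched as a grouped Pearson correlation. This is only an illustration under stated assumptions: the record layout `(family, gc_percent, t_opt)`, the function names, and the toy data are hypothetical, and the sketch does not implement the independent-contrasts step mentioned in the abstract.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


def family_correlations(records, min_species=10):
    """Correlate Topt with GC% separately within each Family.

    records: iterable of (family, gc_percent, t_opt), one tuple per species.
    Returns {family: r} for Families with at least min_species species.
    """
    by_family = defaultdict(list)
    for family, gc, t_opt in records:
        by_family[family].append((gc, t_opt))
    result = {}
    for family, pairs in by_family.items():
        if len(pairs) >= min_species:
            gcs, temps = zip(*pairs)
            result[family] = pearson_r(temps, gcs)
    return result


# Hypothetical toy records; the threshold is lowered so the example runs.
toy = [("FamA", 40, 20), ("FamA", 50, 30), ("FamA", 60, 40), ("FamB", 30, 50)]
print(family_correlations(toy, min_species=3))
```

Grouping before correlating is the point of the design: pooling all species into one call to `pearson_r` corresponds to the all-species analysis in which no correlation was found.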
The patterns of synonymous codon choices of the completely sequenced genome of the bacterium Chlamydia trachomatis were analysed. We found that the most important source of variation among the genes results from whether the sequence is located on the leading or lagging strand of replication, resulting in an overrepresentation of G or C, respectively. This can be explained by different mutational biases associated with the different enzymes that replicate each strand. Next we found that most highly expressed sequences are located on the leading strand of replication. From this result, replicational-transcriptional selection can be invoked. Then, when the genes located on the leading strand are studied separately, the correspondence analysis detects a principal trend which discriminates between lowly and highly expressed sequences, the latter displaying a different codon usage pattern from the former, suggesting selection for translation, which is reinforced by the fact that Ks values between orthologous sequences from C. trachomatis and Chlamydia pneumoniae are much smaller in highly expressed genes. Finally, synonymous codon choices appear to be influenced by the hydropathy of each encoded protein and by the degree of amino acid conservation. Therefore, synonymous codon usage in C. trachomatis seems to be the result of a very complex balance among different factors, which raises the question of whether the forces driving codon usage patterns among microorganisms are more complex than generally accepted.
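The strand-related excess of G or C described above is commonly summarised as GC skew, (G - C) / (G + C). The following minimal sketch assumes each gene is supplied as its coding-strand sequence; the function name and the example sequences are hypothetical, and this is not presented as the statistic used by the authors, who relied on correspondence analysis of codon usage.

```python
def gc_skew(sequence: str) -> float:
    """Return (G - C) / (G + C) for a DNA sequence.

    Positive values indicate an excess of G over C, negative values an
    excess of C over G.
    """
    seq = sequence.upper()
    g = seq.count("G")
    c = seq.count("C")
    if g + c == 0:
        raise ValueError("sequence contains no G or C")
    return (g - c) / (g + c)


# Hypothetical toy sequences: one G-rich, one C-rich.
print(gc_skew("ATGGGTGGAGCA"))
print(gc_skew("ATGCCTCCAGCA"))
```

Under the pattern reported in the abstract, genes on the leading strand would tend towards positive values and genes on the lagging strand towards negative ones.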
In unicellular species codon usage is determined by mutational biases and natural selection. Among prokaryotes, the influence of these factors is different if the genome is skewed towards AT or GC, since in AT-rich organisms translational selection is absent. On the other hand, in AT-rich unicellular eukaryotes the two factors are present. In order to understand whether GC-rich genomes display a similar behavior, the case of Chlamydomonas reinhardtii was studied. Since we found that translational selection strongly influences codon usage in this species, we conclude that there is no common pattern among unicellular organisms.
We have analyzed the patterns of synonymous codon preferences of the nuclear genes of Plasmodium falciparum, a unicellular parasite characterized by an extremely GC-poor genome. When all genes are considered, codon usage is strongly biased toward A and T in third codon positions, as expected, but multivariate statistical analysis detects a major trend among genes. At one end genes display codon choices determined mainly by the extreme genome composition of this parasite, and very probably their expression level is low. At the other end a few genes exhibit an increased relative usage of a particular subset of codons, many of which are C-ending. Since the majority of these few genes is putatively highly expressed, we postulate that the increased C-ending codons are translationally optimal. In conclusion, while codon usage of the majority of P. falciparum genes is determined mainly by compositional constraints, a small number of genes exhibit translational selection.
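The "relative usage" of synonymous codons is often quantified as relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below is an illustration under stated assumptions, not the authors' procedure: it assumes the standard genetic code, and the function name and toy sequence are hypothetical.

```python
from collections import Counter

# Standard genetic code in TCAG order (assumed; '*' marks stop codons).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def rscu(coding_sequence: str) -> dict:
    """Return {codon: RSCU} for amino acids with two or more synonymous codons.

    Amino acids absent from the sequence are omitted; stop codons, Met and
    Trp are skipped because they have no synonymous alternatives to compare.
    """
    seq = coding_sequence.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    triplets = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(t for t in triplets if t in CODON_TABLE)

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        if aa == "*" or len(family) < 2:
            continue
        total = sum(counts[c] for c in family)
        if total == 0:
            continue
        expected = total / len(family)  # count under equal usage
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


# Hypothetical toy sequence: three TGT codons and one TGC codon (all Cys).
print(rscu("TGTTGTTGTTGC"))
```

An RSCU value above 1 means a codon is used more often than expected under equal usage; comparing values between putatively highly and lowly expressed genes is one way to look for the kind of C-ending preference described above.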
The usage of synonymous codons and the frequencies of amino acids were investigated in the complete genome of the bacterium Thermotoga maritima using a multivariate statistical approach. The GC3 content of each gene was the most prominent source of variation of codon usage. Surprisingly, the usage of UGU and UGC (synonymous triplets coding for Cys, the least frequent amino acid in this species) was detected as the second most prominent source of variation. However, this result is probably an artifact due to the very low frequency of Cys together with the nonbiased composition of this genome. The third trend was related to the preferential usage of a subset of codons among highly expressed genes, and these triplets are presumed to be translationally optimal. Concerning the amino acid usage, the hydropathy level of each protein (and therefore the frequency of charged residues) was the main trend, while the second factor was related to the frequency of usage of the smaller residues, suggesting that the cell economy strongly influences the architecture of the proteins. The third axis of the analysis discriminated the usage of Phe, Tyr, Trp (aromatic residues) plus Cys, Met, and His. These six residues have in common the property of being the preferential targets of reactive oxygen species, and therefore the anaerobic condition of T. maritima is an important factor for the amino acid frequencies. Finally, the Cys content of each protein was the fourth trend.
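GC3, the G+C content at third codon positions, can be illustrated with a short sketch. The function name and example sequence are assumptions for illustration; this simplified version counts every third position of the reading frame, whereas codon usage studies frequently restrict GC3 to synonymous positions (for example, by excluding Met, Trp and stop codons).

```python
def gc3_percent(coding_sequence: str) -> float:
    """Return the percentage of G or C at third codon positions.

    The sequence is assumed to start in frame; a trailing partial codon
    and ambiguous bases at third positions are ignored.
    """
    seq = coding_sequence.upper()
    usable = len(seq) - len(seq) % 3
    thirds = [base for base in seq[2:usable:3] if base in "ACGTU"]
    if not thirds:
        raise ValueError("no complete codons with unambiguous third positions")
    gc = sum(1 for base in thirds if base in "GC")
    return 100.0 * gc / len(thirds)


# Hypothetical toy sequence: codons ATG, GCC, TTA.
print(gc3_percent("ATGGCCTTA"))
```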