Prokaryote genomes exhibit a wide range of GC contents and codon usages, both resulting from an interaction between mutational bias and natural selection. In order to investigate the basis underlying specific codon changes, we performed a comprehensive analysis of 29 different prokaryote families. The analysis of core-gene sets with increasing ancestries in each family lineage revealed that the codon usages became progressively more adapted to the tRNA pools. While, as previously reported, highly expressed genes presented the most optimized codon usage, the singletons contained the least selectively favored codons. Results showed that usually codons with the highest translational adaptation were preferentially enriched. In agreement with previous reports, a C-bias in 2- to 3-fold codons and a U-bias in 4-fold codons occurred in all families, irrespective of the global genomic GC content. Furthermore, the U-biases suggested that U3-mRNA-U34-tRNA interactions were responsible for a prominent codon optimization in both the more ancestral core and the highly expressed genes. A comparative analysis of sequences that encode conserved (cr) or variable (vr) translated products, with each one being under high (HEP) and low (LEP) expression levels, demonstrated that efficiency was more relevant (by a factor of 2) than accuracy in modelling codon usage.
Finally, analysis of the third position of codons (GC3) revealed that, in genomes of global GC contents higher than 35-40%, selection favored a GC3 increase; whereas in genomes with very low GC contents, a decrease in GC3 occurred. A comprehensive final model is presented where all patterns of codon usage variations are condensed in five distinct behavioral groups.

IMPORTANCE
The prokaryotic genomes, the current heritage of the more ancient life forms on earth, are composed of diverse gene sets, all characterized by varied origins, ancestries, and spatial-temporal expression patterns. Such genetic diversity has for a long time raised the question of how cells shape their coding strategies to optimize protein demands (i.e., product abundance) and accuracy (i.e., translation fidelity) through the use of the same genetic code in genomes with GC contents that range from less than 20 to over 80%. In this work, we present evidence on how codon usage is adjusted in the prokaryote tree of life, and on how specific biases have operated to improve translation. Through the use of proteome data, we characterized conserved and variable sequence domains in genes of either high or low expression level, and quantitated the relative weight of efficiency and accuracy, as well as their interaction, in shaping codon usage in prokaryotes.

INTRODUCTION

The wide range of GC contents exhibited by prokaryote genomes (i.e., from less than 20 to 80%) is believed to be primarily caused by interspecies differences in mutational processes that operate on both the coding and the noncoding regions (...