Background
K-mer spectra of DNA sequences contain important information about sequence composition and sequence evolution. We aim to uncover the rules that govern the evolution of genome sequences by studying their k-mer spectra.
Results
We analyzed the intrinsic regularities of the k-mer spectra of 920 genome sequences, ranging from primates to prokaryotes. We found two types of evolutionary selection modes in genome sequences, which we term CG Independent Selection and TA Independent Selection, and the two modes mutually inhibit each other. The intensities of the CG and TA independent selections correlate closely with genome evolution and with the G + C content of the genome sequences. The living habits of a species are also closely related to the independent selection modes adopted by its genome. On this basis, we propose an evolution mechanism in which genome evolution is determined by the intensities of the CG and TA independent selections and by the mutual inhibition between them. Using this mechanism, we further speculate on the evolution modes of prokaryotes in mild and extreme environments during the anaerobic age, on the transition of prokaryotes from anaerobic to aerobic environments on Earth, and on the origins of different eukaryotes.
Conclusion
We found two independent selection modes in genome sequences. The evolution of genome sequences is determined by these two modes and by the mutual inhibition between them.
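To make the notion of a k-mer spectrum concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes that a spectrum is the distribution of "number of distinct k-mers" over "occurrence count", and that k-mers can be grouped by how many copies of a dinucleotide such as CG or TA they contain. The function names (`kmer_counts`, `kmer_spectrum`, `split_by_dinucleotide`) and the toy sequence are hypothetical and chosen only for illustration.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every k-mer in seq with a sliding window of step 1.

    Windows containing characters other than A, C, G, T (e.g. N) are skipped.
    """
    seq = seq.upper()
    valid = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= valid:
            counts[kmer] += 1
    return counts


def kmer_spectrum(counts):
    """Map each occurrence count to the number of distinct k-mers having it."""
    return Counter(counts.values())


def split_by_dinucleotide(counts, dinuc="CG"):
    """Group k-mer counts by how many times dinuc occurs inside each k-mer.

    This grouping is an assumption made for illustration; the abstract does
    not state how the CG and TA subsets are defined.
    """
    groups = {}
    for kmer, n in counts.items():
        hits = sum(1 for i in range(len(kmer) - 1) if kmer[i:i + 2] == dinuc)
        groups.setdefault(hits, {})[kmer] = n
    return groups


if __name__ == "__main__":
    toy = "ACGTACGTTTAACGCGTA"  # toy sequence, not real genome data
    counts = kmer_counts(toy, k=3)
    print(kmer_spectrum(counts))
    for hits, sub in sorted(split_by_dinucleotide(counts, "CG").items()):
        print(hits, kmer_spectrum(Counter(sub)))
```

On a real genome one would typically use a larger k (the second abstract below works with 8-mers) and compare the spectra of the CG-containing and TA-containing subsets against the spectrum of the remaining k-mers.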
Nucleosome positioning and remodeling correlate closely with DNA sequence bias. DNA motifs that interact preferentially with histone octamers may direct nucleosome positioning, so identifying the complete set of nucleosome binding motifs is crucial for understanding the roles of nucleosomes in gene regulation. Based on the 8-mer multimodal spectra of the human genome, we infer a systematic set of nucleosome binding motifs. The structural features of these motifs and their density distributions across different types of sequences are consistent with published data. The distributions of these motifs around several functional sites could describe the ground state of nucleosome distribution in those regions. Our results strongly support the reliability of the predicted nucleosome binding motif set, and our findings indicate that the recognition of distinct functional sites or sequences could be greatly improved with the help of a nucleosome array.