We present data-analytic and statistical tools for studying rates of rearrangement of whole genomes and for assessing the stability of these methods under changes in the level of resolution of the genomic data. We construct datasets on the numbers of conserved syntenies and conserved segments shared by pairs of animal genomes at different levels of resolution. We fit these data to an evolutionary tree and find the rates of rearrangement on various evolutionary lineages. We document the lack of clocklike behavior of rearrangement processes, the independence of translocation and inversion rates, and the level of resolution beyond which translocation rates are lost in noise due to other processes.
We study the probability distribution of genomic distance d under the hypothesis of random gene order. We translate the random order assumption into a stochastic method for constructing the alternating color cycles in the decomposition of the bicolored breakpoint graph. For two random genomes of length n, we show that the expectation of n - d is O((1/2) log n).
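The cycle decomposition mentioned above can be illustrated with a small simulation. The following is a minimal sketch, not the authors' stochastic construction: it assumes linear, signed genomes, compares one random signed permutation against the identity, and counts the alternating cycles c in the standard breakpoint graph. It uses only the cycle-based lower bound d >= n + 1 - c and ignores hurdle corrections, so under that bound n - d is at most c - 1. The function names `breakpoint_cycles` and `mean_cycles` are hypothetical.

```python
import math
import random


def breakpoint_cycles(perm):
    """Count alternating cycles in the breakpoint graph of a signed
    permutation of 1..n, taken relative to the identity permutation."""
    n = len(perm)
    # Each gene x becomes two vertices: (2x-1, 2x) if positive,
    # (2x, 2x-1) if negative. Vertices 0 and 2n+1 frame the genome.
    seq = [0]
    for x in perm:
        if x > 0:
            seq += [2 * x - 1, 2 * x]
        else:
            seq += [-2 * x, -2 * x - 1]
    seq.append(2 * n + 1)

    # Edges of the first color join neighbouring gene ends in perm.
    black = {}
    for i in range(0, 2 * n + 2, 2):
        a, b = seq[i], seq[i + 1]
        black[a] = b
        black[b] = a

    # Edges of the second color join 2i and 2i+1 (adjacencies of the
    # identity), so the partner of vertex v is v ^ 1.
    seen = set()
    cycles = 0
    for start in range(2 * n + 2):
        if start in seen:
            continue
        cycles += 1
        v = start
        while v not in seen:
            seen.add(v)
            w = black[v]
            seen.add(w)
            v = w ^ 1
    return cycles


def mean_cycles(n, trials, seed=0):
    """Average cycle count over uniformly random signed permutations."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = list(range(1, n + 1))
        rng.shuffle(perm)
        perm = [x if rng.random() < 0.5 else -x for x in perm]
        total += breakpoint_cycles(perm)
    return total / trials


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        c = mean_cycles(n, trials=200)
        print(n, round(c, 3), round(0.5 * math.log(n), 3))
```

Running the script prints, for each n, the simulated mean cycle count next to (1/2) log n, which allows an informal comparison of their growth. The identity permutation yields n + 1 cycles (d = 0), which serves as a quick sanity check.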
Gene cluster significance tests that are based on the number of genes in a cluster in two genomes and on how compactly they are distributed, but not on their order, may be made more powerful by the addition of a test component that focuses solely on the similarity of the ordering of the common genes in the clusters in the two genomes. Here we suggest four such tests, compare them, and investigate one of them, the maximum adjacency disruption criterion, in some detail, analytically and through simulation.