Accurate models of the relationship between genotype and phenotype are necessary to understand and predict how mutations to biological sequences affect the fitness and evolution of living organisms. The apparent abundance of epistasis (genetic interactions), both between and within genes, complicates this task, and how to build mechanistic models that incorporate epistatic coefficients (genetic interaction terms) remains an open question. The Walsh-Hadamard transform underlies a number of related definitions of epistasis. One of its main limitations is that it can accommodate only two alleles (amino acid or nucleotide states) per sequence position. In this paper we provide an extension of the Walsh-Hadamard transform that allows the calculation and modeling of background-averaged epistasis (also known as ensemble or statistical epistasis) in genetic landscapes with an arbitrary number of states per position (20 for amino acids, 4 for nucleotides, etc.). We also provide a recursive formula for the inverse matrix, and we derive formulae to directly extract any element of either matrix without the computationally intensive task of constructing or inverting large matrices. Finally, we demonstrate the utility of our theory by using it to model epistasis within a combinatorially complete multiallelic genetic landscape of a tRNA, revealing that both pairwise and higher-order genetic interactions are enriched between physically interacting positions.
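For orientation, the classical two-allele case mentioned above can be sketched in a few lines. This is a minimal illustration of the standard Walsh-Hadamard matrix only, not the multiallelic extension the abstract describes. The function names (`hadamard`, `hadamard_element`, `inverse_hadamard_element`) are hypothetical, and the sketch assumes genotypes are indexed as binary numbers in natural (Sylvester) order.

```python
import numpy as np

def hadamard(n):
    """Build the 2^n x 2^n Walsh-Hadamard matrix recursively.

    Each step applies H_{k+1} = [[H_k, H_k], [H_k, -H_k]],
    written here as a Kronecker product with the 2x2 block.
    """
    block = np.array([[1, 1], [1, -1]])
    H = np.array([[1]])
    for _ in range(n):
        H = np.kron(block, H)
    return H

def hadamard_element(i, j):
    """Return entry (i, j) without building the matrix.

    The sign is (-1) raised to the number of bit positions
    that are set in both i and j.
    """
    return -1 if bin(i & j).count("1") % 2 else 1

def inverse_hadamard_element(i, j, n):
    """Entry (i, j) of the inverse, using H^{-1} = H / 2^n."""
    return hadamard_element(i, j) / 2**n

if __name__ == "__main__":
    n = 3  # three biallelic positions, so 8 genotypes
    H = hadamard(n)
    # The direct formula agrees with the recursively built matrix.
    assert all(H[i, j] == hadamard_element(i, j)
               for i in range(2**n) for j in range(2**n))
    # H is its own inverse up to the factor 2^n.
    assert np.array_equal(H @ H, 2**n * np.eye(2**n, dtype=int))
```

The element-wise functions show, for the two-allele case, why a single entry can be read off without constructing or inverting the full matrix, whose size grows as 2^n in the number of positions.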
We define a new class of Ξ-coalescents characterized by a possibly infinite measure over the nonnegative integers. We call them symmetric coalescents since they are the unique family of exchangeable coalescents satisfying a symmetry property on their coagulation rates: they are invariant under any transformation that consists of moving one element from one block to another without changing the total number of blocks. We illustrate the diversity of behaviors of this family of processes by introducing and studying a one-parameter subclass, the (β, S)-coalescents. We also embed this family in a larger class of Ξ-coalescents arising as the limit genealogies of Wright-Fisher models with bottlenecks. Some convergence results rely on a new Skorokhod-type metric that induces the Meyer-Zheng topology and allows one to study the scaling limit of non-Markovian processes using standard techniques.
After admixture, recombination breaks down genomic blocks of contiguous ancestry. The breakdown of these blocks forms a new “molecular clock” that ticks at a much faster rate than the mutation clock, enabling accurate dating of admixture events in the recent past. However, existing theory on the breakdown of these blocks, or the accumulation of delineations between blocks, so-called “junctions”, has mostly been limited to regularly spaced markers on phased data. Here, we present an extension to the theory of junctions using the ancestral recombination graph that describes the expected number of junctions for any distribution of markers along the genome. Furthermore, we provide a new framework to infer the time since admixture using unphased data. We demonstrate both the phased and unphased methods on simulated data and show that our new extensions have improved accuracy with respect to previous methods, especially for smaller population sizes and more ancient admixture times. Lastly, we demonstrate the applicability of our method on three empirical data sets: lab crosses of yeast (Saccharomyces cerevisiae) and two case studies of hybridization in swordtail fish and Populus trees.