The Wright-Fisher family of diffusion processes is a widely used class of evolutionary models. However, simulation is difficult because there is no known closed-form formula for its transition function. In this article we demonstrate that it is in fact possible to simulate exactly from a broad class of Wright-Fisher diffusion processes and their bridges. For those diffusions corresponding to reversible, neutral evolution, our key idea is to exploit an eigenfunction expansion of the transition function; this approach even applies to its infinite-dimensional analogue, the Fleming-Viot process. We then develop an exact rejection algorithm for processes with more general drift functions, including those modelling natural selection, using ideas from retrospective simulation. Our approach also yields methods for exact simulation of the moment dual of the Wright-Fisher diffusion, the ancestral process of an infinite-leaf Kingman coalescent tree. We believe our new perspective on diffusion simulation holds promise for other models admitting a transition eigenfunction expansion.
Supported in part by EPSRC Research Grant EP/L018497/1.
MSC 2010 subject classifications: Primary 65C05; secondary 60H35, 60J60, 92D15.
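To illustrate how the ancestral process can drive simulation of a neutral Wright-Fisher diffusion, the following Python sketch uses the classical mixture representation of the neutral transition density: given the number m of surviving lineages at time t, draw L from Binomial(m, x) and then the new frequency from Beta(θ1 + L, θ2 + m − L). This is not the exact algorithm of the article. It assumes death rates k(k + θ − 1)/2 with θ = θ1 + θ2, and it truncates the infinite-leaf ancestral process at a finite number of starting lineages, so it is only an approximation. All function and parameter names (`lineages_at_time`, `sample_wf_transition`, `n_start`) are hypothetical.

```python
import numpy as np

def lineages_at_time(t, theta, n_start, rng):
    """Lineages remaining at time t in a pure death process with
    rate k(k + theta - 1)/2, started from n_start lineages.
    Assumption: n_start is a finite stand-in for infinitely many leaves."""
    k, elapsed = n_start, 0.0
    while k > 0:
        rate = k * (k + theta - 1) / 2.0
        elapsed += rng.exponential(1.0 / rate)
        if elapsed > t:
            break
        k -= 1
    return k

def sample_wf_transition(x, t, theta1, theta2, rng, n_start=100):
    """Approximate draw of X_t given X_0 = x for a neutral two-allele
    Wright-Fisher diffusion with mutation parameters theta1, theta2 > 0."""
    if theta1 <= 0 or theta2 <= 0:
        raise ValueError("this sketch assumes theta1 > 0 and theta2 > 0")
    m = lineages_at_time(t, theta1 + theta2, n_start, rng)
    l = rng.binomial(m, x)  # surviving lineages carrying the first allele
    return rng.beta(theta1 + l, theta2 + m - l)

rng = np.random.default_rng(0)
draws = [sample_wf_transition(0.2, 0.5, 1.0, 2.0, rng) for _ in range(5)]
print(draws)
```

For large t all lineages are lost with high probability, so the draws approach the stationary Beta(θ1, θ2) distribution. An exact method would additionally have to sample the number of lineages started from infinitely many leaves without truncation, which this sketch does not attempt.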
We dedicate this paper to Sir John Kingman on his 70th Birthday. In modern mathematical population genetics the ancestral history of a population of genes back in time is described by John Kingman's coalescent tree. Classical and modern approaches model gene frequencies by diffusion processes. This paper, which is partly a review, discusses how coalescent processes are dual to diffusion processes in an analytic and probabilistic sense. Bochner (1954) and Gasper (1972) were interested in characterizations of processes with Beta stationary distributions and Jacobi polynomial eigenfunctions. We discuss the connection with Wright-Fisher diffusions and the characterization of these processes. Subordinated Wright-Fisher diffusions are of this type. An Inverse Gaussian subordinator is interesting and important in subordinated Wright-Fisher diffusions and is related to the Jacobi Poisson Kernel in orthogonal polynomial theory. A related time-subordinated forest of non-mutant edges in the Kingman coalescent is novel.
We present a new model for seed banks, where direct ancestors of individuals may have lived in the near as well as the very far past. The classical Wright-Fisher model, as well as a seed bank model with bounded age distribution considered in Kaj, Krone and Lascoux (2001), are special cases of our model. We discern three parameter regimes of the seed bank age distribution, which lead to substantially different behaviour in terms of genetic variability, in particular with respect to fixation of types and time to the most recent common ancestor. We prove that, for age distributions with finite mean, the ancestral process converges to a time-changed Kingman coalescent, while in the case of infinite mean, ancestral lineages might not merge at all with positive probability. Furthermore, we present a construction of the forward-in-time process in equilibrium. The mathematical methods are based on renewal theory, on the urn process introduced in Kaj, Krone and Lascoux (2001), and on a paper by Hammond and Sheffield (2013).
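The renewal structure behind a seed bank lineage can be sketched in Python as follows. The sketch assumes that each ancestor is found by jumping back a random number of generations drawn from the age distribution and then picking an individual uniformly at random. That reading, the function `ancestral_lineage`, and the samplers passed to it are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def ancestral_lineage(max_depth, pop_size, sample_age, rng):
    """Trace one lineage backwards in time.

    Returns a list of (generations_back, individual_index) pairs.
    sample_age(rng) is a hypothetical sampler for the seed bank age
    distribution and must return an integer >= 1."""
    depth = 0
    path = [(depth, int(rng.integers(0, pop_size)))]
    while True:
        gap = int(sample_age(rng))
        if gap < 1:
            raise ValueError("ages must be at least one generation")
        depth += gap
        if depth > max_depth:
            break
        # Assumption: the ancestor is uniform among pop_size individuals.
        path.append((depth, int(rng.integers(0, pop_size))))
    return path

rng = np.random.default_rng(0)
# Age identically 1: every generation is visited (classical Wright-Fisher).
print(ancestral_lineage(5, 10, lambda r: 1, rng))
# Heavy-tailed ages (Zipf with exponent 1.5 has infinite mean).
print(ancestral_lineage(1000, 10, lambda r: r.zipf(1.5), rng))
```

Two lineages can only merge in a generation that both of them visit, which is why the tail of the age distribution matters for whether and when lineages coalesce.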
A two-types, discrete-time population model with finite, constant size is constructed, allowing for a general form of frequency-dependent selection and skewed offspring distribution. Selection is defined based on the idea that individuals first choose a (random) number of potential parents from the previous generation and then, from the selected pool, they inherit the type of the fittest parent. The probability distribution function of the number of potential parents per individual thus parametrises entirely the selection mechanism. Using duality, weak convergence is then proved both for the allele frequency process of the selectively weak type and for the population's ancestral process. The scaling limits are, respectively, a two-types Ξ-Fleming-Viot jump-diffusion process with frequency-dependent selection, and a branching-coalescing process with general branching and simultaneous multiple collisions. Duality also leads to a characterisation of the probability of extinction of the selectively weak allele, in terms of the ancestral process' ergodic properties.
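The potential-parents mechanism described above can be sketched in Python as a single generation step. The sketch assumes that potential parents are drawn uniformly with replacement, that type 1 is always the fitter type, and that the number of potential parents comes from a user-supplied sampler. It ignores skewed offspring distributions. The names `next_generation` and `sample_num_parents` are hypothetical.

```python
import numpy as np

def next_generation(types, sample_num_parents, rng):
    """One generation of a two-type, constant-size population.

    types: array of 0 (selectively weak) and 1 (fit).
    sample_num_parents(rng): hypothetical sampler for the number of
    potential parents per individual; must return an integer >= 1."""
    types = np.asarray(types)
    n = types.size
    offspring = np.empty(n, dtype=types.dtype)
    for i in range(n):
        k = int(sample_num_parents(rng))
        if k < 1:
            raise ValueError("each individual needs at least one potential parent")
        parents = rng.integers(0, n, size=k)  # uniform, with replacement
        # Inherit the type of the fittest potential parent.
        offspring[i] = types[parents].max()
    return offspring

rng = np.random.default_rng(0)
population = np.array([0] * 50 + [1] * 50)
# With exactly one potential parent the step is neutral resampling.
neutral = next_generation(population, lambda r: 1, rng)
# With one or two potential parents, the fit type has an advantage.
selected = next_generation(population, lambda r: int(r.integers(1, 3)), rng)
print(neutral.mean(), selected.mean())
```

An individual ends up with the weak type only if every one of its potential parents is weak, so larger pools of potential parents correspond to stronger selection against the weak type.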
The frequencies $X_1, X_2, \ldots$ of an exchangeable Gibbs random partition $\Pi$ of $\mathbb{N} = \{1, 2, \ldots\}$ (Gnedin and Pitman (2006)) are considered in their age-order, i.e. their size-biased order. We study their dependence on the sequence $i_1, i_2, \ldots$ of least elements of the blocks of $\Pi$. In particular, conditioning on $1 = i_1 < i_2 < \ldots$, a representation is shown to be [equation missing in the source], where $\{\xi_j : j = 1, 2, \ldots\}$ is a sequence of independent Beta random variables. Sequences with such a product form are called neutral to the left. We show that the property of conditional left-neutrality in fact characterizes the Gibbs family among all exchangeable partitions, and leads to further interesting results on: (i) the conditional Mellin transform of $X_k$, given $i_k$, and (ii) the conditional distribution of the first $k$ normalized frequencies, given $\sum_{j=1}^{k} X_j$ and $i_k$; the latter turns out to be a mixture of Dirichlet distributions. Many of the mentioned representations are extensions of the results of Griffiths and Lessard (2005) on Ewens' partitions.
We consider a multivariate version of the so-called Lancaster problem of characterizing canonical correlation coefficients of symmetric bivariate distributions with identical marginals and orthogonal polynomial expansions. The marginal distributions examined in this paper are the Dirichlet and the Dirichlet multinomial distribution, respectively, on the continuous and the N-discrete d-dimensional simplex. Their infinite-dimensional limit distributions, respectively, the Poisson-Dirichlet distribution and Ewens's sampling formula, are considered as well. We study, in particular, the possibility of mapping canonical correlations on the d-dimensional continuous simplex (i) to canonical correlation sequences on the (d + 1)-dimensional simplex and/or (ii) to canonical correlations on the discrete simplex, and vice versa. Driven by this motivation, the first half of the paper is devoted to providing a full characterization and probabilistic interpretation of n-orthogonal polynomial kernels (i.e., sums of products of orthogonal polynomials of the same degree n) with respect to the mentioned marginal distributions. We establish several identities and some integral representations which are multivariate extensions of important results known for the case d = 2 since the 1970s. These results, along with a common interpretation of the mentioned kernels in terms of dependent Pólya urns, are shown to be key features leading to several non-trivial solutions to Lancaster's problem, many of which can be extended naturally to the limit as d → ∞.
We study weighted particle systems in which new generations are resampled from current particles with probabilities proportional to their weights. This covers a broad class of sequential Monte Carlo (SMC) methods, widely used in applied statistics and cognate disciplines. We consider the genealogical tree embedded into such particle systems, and identify conditions, as well as an appropriate time-scaling, under which they converge to the Kingman n-coalescent in the infinite system size limit in the sense of finite-dimensional distributions. Thus, the tractable n-coalescent can be used to predict the shape and size of SMC genealogies, as we illustrate by characterising the limiting mean and variance of the tree height. SMC genealogies are known to be connected to algorithm performance, so that our results are likely to have applications in the design of new methods as well. Our conditions for convergence are strong, but we show by simulation that they do not appear to be necessary.
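As a point of reference for the tree height mentioned above, the following Python sketch computes the mean and variance of the height of a standard Kingman n-coalescent and checks them by simulation. It works in coalescent time units, where k lineages merge at rate k(k − 1)/2, and does not apply the SMC time-scaling identified in the paper. The function names are hypothetical.

```python
import numpy as np

def expected_tree_height(n):
    """Mean height: sum over k = 2..n of 2 / (k(k-1)) = 2(1 - 1/n)."""
    return 2.0 * (1.0 - 1.0 / n)

def tree_height_variance(n):
    """Variance: sum over k = 2..n of (2 / (k(k-1)))^2, since the
    waiting times are independent exponentials."""
    return sum(4.0 / (k * k * (k - 1) ** 2) for k in range(2, n + 1))

def simulate_tree_heights(n, num_trees, rng):
    """Draw tree heights as sums of independent exponential waiting
    times with rates k(k-1)/2 for k = n, ..., 2."""
    ks = np.arange(2, n + 1)
    rates = ks * (ks - 1) / 2.0
    waits = rng.exponential(1.0 / rates, size=(num_trees, ks.size))
    return waits.sum(axis=1)

rng = np.random.default_rng(0)
heights = simulate_tree_heights(10, 20000, rng)
print(heights.mean(), expected_tree_height(10))
print(heights.var(), tree_height_variance(10))
```

The mean stays below 2 for every n, and the final merger of the last two lineages alone contributes an expected 1 time unit, so the tree height is dominated by the last few coalescence events.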