2020
DOI: 10.1101/2020.12.04.410670
Preprint

PipeMaster: inferring population divergence and demographic history with approximate Bayesian computation and supervised machine-learning in R

Abstract: Understanding population divergence involves testing diversification scenarios and estimating historical parameters, such as divergence time, population size and migration rate. There is, however, an immense space of possible highly parameterized scenarios that are difficult or impossible to solve analytically. To overcome this problem, researchers have used alternative simulation-based approaches, such as approximate Bayesian computation (ABC) and supervised machine learning (SML), to approximate posterior pr…

Cited by 16 publications (16 citation statements)
References 38 publications
“…While an exact mutation rate is not available for the majority of Malagasy herpetofauna, this standard deviation encompasses mitochondrial mutation rate estimates for closely related herpetofaunal groups reported in the literature (Daza et al, 2010; Gehring et al, 2013; Kurabayashi et al, 2008; Wollenberg et al, 2011; see Table 1; Table S1 for summary of prior values). We generated 500,000 simulations per model (expansion, constant and bottleneck) for all observed populations, with number of base pairs and number of individuals specific to each observed dataset (see Table S1) under random parameter draws from the aforementioned priors in PipeMaster (Gehara et al, 2017; Gehara et al, 2019). We reduced our observed sequence data into the following summary statistics per population: nucleotide diversity ( π ), number of segregating sites (ss), haplotype diversity ( H ), Tajima's D and the first three bins of the site frequency spectrum (ss1, ss2 and ss3); see Table S5 for full descriptions of summary statistics.…”
Section: Methods
Citation type: mentioning
confidence: 99%
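The per-population summary statistics named in the statement above can be illustrated with a minimal Python sketch. This is not PipeMaster's code or API (PipeMaster is an R package). The function name `summary_stats` is hypothetical, the input is assumed to be aligned, gap-free sequences, and the site frequency spectrum is assumed to be folded (minor-allele counts at biallelic sites), which the quoted text does not specify. Nucleotide diversity is reported here as the mean number of pairwise differences per locus rather than per site.

```python
from itertools import combinations
from math import sqrt


def summary_stats(seqs):
    """Summary statistics for aligned, gap-free sequences (illustrative only)."""
    n = len(seqs)
    length = len(seqs[0])

    # Segregating sites: alignment columns with more than one distinct base.
    columns = [[s[i] for s in seqs] for i in range(length)]
    seg_cols = [col for col in columns if len(set(col)) > 1]
    S = len(seg_cols)

    # Nucleotide diversity: mean number of pairwise differences (per locus).
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies).
    counts = {}
    for s in seqs:
        counts[s] = counts.get(s, 0) + 1
    H = n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

    # Tajima's D (Tajima 1989), undefined when there are no segregating sites.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    D = (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1)) if S > 0 else float("nan")

    # First three bins of the folded SFS (assumption: folded, biallelic sites only).
    sfs = [0, 0, 0]
    for col in seg_cols:
        alleles = set(col)
        if len(alleles) == 2:
            minor = min(col.count(a) for a in alleles)
            if 1 <= minor <= 3:
                sfs[minor - 1] += 1

    return {"pi": pi, "ss": S, "H": H, "tajD": D, "sfs": sfs}


# Toy alignment of four sequences, invented for illustration.
stats = summary_stats(["AAAA", "AAAT", "AATT", "AATT"])
```

Reducing both observed and simulated data to the same small vector of statistics is what makes them comparable in the simulation-based approaches the abstract describes.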
“…We then used ipyrad for de novo assembly (Eaton and Overcast, 2020) and subsequent processing (see Supplementary Table 1 for parameters). Ipyrad outputs a ".alleles" file with phased loci, which we further converted to FASTA format with the function "iPyrad.alleles.loci2fasta" available in the R package PipeMaster (Gehara et al, 2020). Our final dataset consisted of 2,725 unlinked loci (Table 1) from 10 individuals.…”
Section: Data Acquisition and Processing
Citation type: mentioning
confidence: 99%
“…We explored intraspecific diversity and population structure using principal components analysis (PCA) using the R 3.4.4 [45] package adegenet v. 2.1.2 [46], sNMF [47] and NgsAdmix [48]. Using PipeMaster v. 0.0.9 [49], we estimated the mean and variance of seven population genetic summary statistics. Additional methodological details are available in the electronic supplementary material.…”
Section: (B) Genetic Diversity
Citation type: mentioning
confidence: 99%
“…SML is statistically robust and outperforms traditional methods [50]. To specify model parameters, simulate data, and estimate summary statistics, we used the R package PipeMaster v. 0.0.9 [49], which has been used to test alternative evolutionary histories using both approximate Bayesian computation [51] and SML [52]. For all nine species, we independently tested a constant population size model (CS) plus two single-epoch models: population expansion (EXP) and population bottleneck (BN).…”
Section: (C) Demographic Model Selection
Citation type: mentioning
confidence: 99%
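As a rough illustration of how simulated summary statistics can be used to compare models such as CS, EXP and BN, the following Python sketch implements plain ABC rejection for model choice. It is not the procedure used in the cited studies and not PipeMaster's API. The function name `abc_model_choice`, the tolerance value and the toy numbers are all hypothetical, and reading the accepted fractions as posterior model probabilities assumes that every model was simulated the same number of times.

```python
from collections import Counter
from math import sqrt


def abc_model_choice(observed, simulations, tolerance=0.1):
    """Approximate posterior model probabilities by ABC rejection.

    observed:    list of observed summary statistics
    simulations: list of (model_label, [summary statistics]) tuples
    tolerance:   fraction of simulations closest to the observed data to keep
    """
    k = len(observed)

    # Scale each statistic by its standard deviation across all simulations
    # so that no single statistic dominates the distance.
    cols = [[stats[j] for _, stats in simulations] for j in range(k)]
    means = [sum(c) / len(c) for c in cols]
    sds = [sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]

    def distance(stats):
        # Euclidean distance between scaled simulated and observed statistics.
        return sqrt(sum(((s - o) / sd) ** 2
                        for s, o, sd in zip(stats, observed, sds)))

    # Keep the closest fraction of simulations and count models among them.
    ranked = sorted(simulations, key=lambda sim: distance(sim[1]))
    n_keep = max(1, int(len(ranked) * tolerance))
    accepted = Counter(label for label, _ in ranked[:n_keep])
    return {label: count / n_keep for label, count in accepted.items()}


# Toy example: two invented statistics per simulation, two simulations per model.
toy_sims = [
    ("CS", [0.0, 1.0]), ("CS", [0.1, 1.1]),
    ("EXP", [-1.5, 0.5]), ("EXP", [-1.4, 0.4]),
    ("BN", [1.5, 2.0]), ("BN", [1.6, 2.1]),
]
probs = abc_model_choice([-1.45, 0.45], toy_sims, tolerance=0.34)
```

In this toy case both accepted simulations come from the EXP model. A supervised machine-learning classifier trained on the same simulation table would serve the same purpose as the rejection step.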