2023
DOI: 10.7554/elife.84874.3

Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations

Abstract: Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges a…

Cited by 8 publications (7 citation statements)
References: 88 publications
“…We also generated a test set incorporating background selection into this scenario of introgression. To accomplish this, we used stdpopsim version 0.2.0 [82, 83] to generate SLiM scripts simulating negative and background selection using a genetic map [84] and distribution of fitness effects [85] for mutations in exonic regions, and exon annotations (the FlyBase BDGP6.32.51 exons set in stdpopsim, taken from FlyBase [86]) all obtained from D. melanogaster. We then programmatically modified the SLiM scripts to include bidirectional introgression under the same scenario examined above, with each script generating one test replicate of a 1 Mb region with recombination and annotation data taken from chr3L, before running on the central 100 kb of each test example.…”
Section: Methods (mentioning)
confidence: 99%
“…We also investigated the impact of background selection (BGS), the impact of linked negative selection on neutral diversity [97], on IntroUNET's accuracy. To do this, we used stdpopsim [82, 83] to incorporate background selection (modeled after randomly chosen regions of the D. melanogaster genome) into our simulations of the same introgression scenario described above (see Methods for details). We generated a test set of 1000 regions each 1 Mb in length and ran IntroUNET on the central 100 kb of each region.…”
Section: IntroUNET Performs Well Under Scenarios Of Model Misspecific... (mentioning)
confidence: 99%
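
The "central 100 kb of a 1 Mb region" used in the statement above can be made concrete with a little arithmetic. The sketch below is only an illustration: the helper name central_window is hypothetical, and the 0-based, half-open coordinate convention and exact centring are assumptions that the quoted text does not specify.

```python
from typing import Tuple


def central_window(region_length: int, window_length: int) -> Tuple[int, int]:
    """Return (start, end) of a window centred in a region.

    Coordinates are 0-based and half-open, i.e. [start, end).
    This convention is an assumption, not taken from the cited paper.
    """
    if not 0 < window_length <= region_length:
        raise ValueError("window_length must be in (0, region_length]")
    start = (region_length - window_length) // 2
    return start, start + window_length


# A 1 Mb simulated region with a 100 kb analysis window in its centre.
start, end = central_window(1_000_000, 100_000)
print(start, end)
```

Under these assumptions the window leaves 450 kb of flanking sequence on each side of the analysed segment.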
“…We simulated the trait with h² = 1 to obtain true marginal effect sizes. We simulated chromosome 22 using the information from stdpopsim catalog [35, 36]. The first five generations were simulated using the Wright-Fisher process and more upstream generations followed the coalescent [29].…”
Section: Methods (mentioning)
confidence: 99%
“…We set this baseline rate to 500c, where c is the average crossover rate per base pair estimated for a species. We used the following c values, which we obtained from [stdpopsim] version 0.2.0 (Adrion et al., 2020; Lauterbur et al., 2023a): 1.30981 × 10⁻⁸ for humans (the mean value on chromosome 12 using data from Consortium (2007)), 1.7966 × 10⁻⁸ for D. melanogaster (the mean on chr3L using data from Comeron et al. (2012)), and 8.06452 × 10⁻¹⁰ for A. thaliana (the genome-wide average rate from Huber et al.…”
Section: Methods (mentioning)
confidence: 99%
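
The baseline rate of 500c described in the statement above is a simple product, and the following sketch works it out for the three quoted crossover rates. It rests on stated assumptions: the dictionary keys and the function name baseline_rate are illustrative only, and the D. melanogaster value is read as 1.7966 × 10⁻⁸, treating the stray "e" in the extracted text as an artifact. What the baseline rate is a rate of is not stated in the excerpt, so the code does not assume it.

```python
# Average crossover rates per base pair (c), as quoted in the citation statement.
CROSSOVER_RATE_PER_BP = {
    "human (chr12 mean)": 1.30981e-8,
    "D. melanogaster (chr3L mean)": 1.7966e-8,  # assumed reading of the garbled value
    "A. thaliana (genome-wide mean)": 8.06452e-10,
}

MULTIPLIER = 500  # the statement sets the baseline rate to 500c


def baseline_rate(c: float, multiplier: int = MULTIPLIER) -> float:
    """Return the baseline rate multiplier * c for a per-bp crossover rate c."""
    return multiplier * c


baseline = {label: baseline_rate(c) for label, c in CROSSOVER_RATE_PER_BP.items()}

for label, rate in baseline.items():
    print(f"{label}: c = {CROSSOVER_RATE_PER_BP[label]:.5e}, 500c = {rate:.5e}")
```

Keeping the multiplier as a named constant makes it easy to see that only c differs between species in this calculation.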