2018
DOI: 10.1038/s41437-018-0124-8

Subsampling reveals that unbalanced sampling affects Structure results in a multi-species dataset

Abstract: Studying the genetic population structure of species can reveal important insights into several key evolutionary, historical, demographic, and anthropogenic processes. One of the most important statistical tools for inferring genetic clusters is the program STRUCTURE. Recently, several papers have pointed out that STRUCTURE may show a bias when the sampling design is unbalanced, resulting in spurious joining of underrepresented populations and spurious separation of overrepresented populations. Suggestions to …

Cited by 43 publications (38 citation statements)
References 31 publications
“…The objective of this work was not to comment on all of the potential caveats to consider using structure and the Δ K method, as these have been reviewed elsewhere (Gilbert et al, ; Kalinowski, ; Meirmans, ; Puechmaille, ; Waples & Gaggiotti, ). We also were not aiming to answer the question of what parameters define a population as a separate unit (Waples & Gaggiotti, ).…”
Section: Discussion (mentioning)
confidence: 99%
“…2017) was used to automate running the clustering analyses and for parsing the output produced by the different programs. For determining the fit of the clustering results to either the populations (correct) or the ploidy levels (bias), we used the approach of Meirmans (2019). For this, we calculated a test statistic β , which is the absolute value of the variable coefficient (“slope”) of an Analysis of Variance with either population ( β pop ) or ploidy level ( β ploidy ) as explanatory variable and the clustering results at k = 2 as response variable.…”
Section: Methods (mentioning)
confidence: 99%
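The β statistic described in the statement above can be illustrated with a minimal sketch. It assumes exactly two groups coded 0 and 1, so that the analysis reduces to a simple linear regression whose slope is the difference between the group means, and it assumes the clustering result is each individual's membership coefficient for the first cluster at k = 2. The function name `beta_statistic` and the toy data are hypothetical and are not taken from Meirmans (2019) or from the citing paper.

```python
from statistics import mean


def beta_statistic(membership, group):
    """Absolute least-squares slope of cluster membership on a 0/1 group code.

    membership: membership coefficient of each individual for cluster 1 (k = 2)
    group:      0/1 indicator per individual (e.g. population or ploidy level)
    """
    if len(membership) != len(group):
        raise ValueError("membership and group must have the same length")
    x_bar = mean(group)
    y_bar = mean(membership)
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(group, membership))
    sxx = sum((x - x_bar) ** 2 for x in group)
    if sxx == 0:
        raise ValueError("group must contain both levels")
    return abs(sxy / sxx)


# Toy data (invented for illustration): clustering follows population, not ploidy.
membership = [0.9, 0.8, 0.95, 0.1, 0.2, 0.15]
population = [0, 0, 0, 1, 1, 1]
ploidy = [0, 1, 0, 1, 0, 1]

beta_pop = beta_statistic(membership, population)
beta_ploidy = beta_statistic(membership, ploidy)
```

In this toy case `beta_pop` exceeds `beta_ploidy`, which under the approach quoted above would indicate that the clusters fit the populations better than the ploidy levels.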
“…In the merged dataset, there were large differences in sample sizes between populations, which could bias the inference of population genetic structure 60 . As recommended by Meirmans 61 , we reduced the sample sizes of larger populations by removing individuals with more than 20% of missing data and then randomly removing excess individuals in order to obtain comparable sample sizes of different wolf and dog populations, as well as similar total number of wolves and dogs. The final dataset included 697 individuals: 83 Asian, 108 European and 119 North American wolves, 160 free-ranging dogs, 180 pure-bred dogs, the Neolithic Irish dog, three Czechoslovakian wolfdogs, two Saarloos wolfdogs, 29 coyotes, eight golden jackals and three black backed jackals.…”
Section: Methods (mentioning)
confidence: 99%
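The two-step subsampling described in the statement above (drop individuals with more than 20% missing data, then randomly remove excess individuals from the larger populations) could be sketched as follows. This is an illustration under stated assumptions, not the citing authors' code: the function name `balance_samples`, the tuple layout of the input, and the `target_size` cap are all hypothetical, and populations already at or below the cap are kept whole.

```python
import random


def balance_samples(individuals, target_size, max_missing=0.20, seed=0):
    """Filter by missing data, then cap each population at target_size.

    individuals: iterable of (individual_id, population, missing_fraction)
    Returns a dict mapping population -> list of retained individual ids.
    """
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible

    # Step 1: discard individuals with more than max_missing missing data.
    by_population = {}
    for ind_id, population, missing in individuals:
        if missing <= max_missing:
            by_population.setdefault(population, []).append(ind_id)

    # Step 2: randomly remove excess individuals from large populations.
    balanced = {}
    for population, ids in by_population.items():
        if len(ids) > target_size:
            balanced[population] = sorted(rng.sample(ids, target_size))
        else:
            balanced[population] = list(ids)
    return balanced


# Toy data (invented): population "A" is overrepresented, "B" is small.
toy = [(f"A{i}", "A", 0.30 if i < 2 else 0.05) for i in range(10)]
toy += [(f"B{i}", "B", 0.10) for i in range(3)]

balanced = balance_samples(toy, target_size=4)
```

With these inputs, population "A" is reduced to four individuals drawn from those that passed the missing-data filter, while the three individuals of "B" are all retained.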