2014
DOI: 10.1534/g3.113.007633

Effects of Sample Selection Bias on the Accuracy of Population Structure and Ancestry Inference

Abstract: Population stratification is an important task in genetic analyses. It provides information about the ancestry of individuals and can be an important confounder in genome-wide association studies. Public genotyping projects have made a large number of datasets available for study. However, practical constraints dictate that of a geographical/ethnic population, only a small number of individuals are genotyped. The resulting data are a sample from the entire population. If the distribution of sample sizes is not…

Cited by 31 publications (21 citation statements)
References 31 publications
“…American ancestry groups are typically younger than 750 generations (15,000 to 22,500 years), consistent with existing knowledge about the settlement of the Americas via the Bering land bridge that connected Asia and North America during the last glacial maximum around 15,000 to 23,000 years ago [45,46]. We note, however, that recent admixture and the sampling strategies of the different data sets [47,48] can have a strong impact on age distributions. For example, variants at high frequency within American populations but that are nevertheless restricted to just American and African populations are, on average, younger than lower-frequency variants (within American populations) with the same geographical restriction (S6 Fig).…”
Section: Distribution Of Allele Age In the Human Genome
Citation type: supporting
confidence: 85%
“…Variants restricted to American ancestry groups are typically younger than 750 generations (15,000 to 22,500 years), consistent with existing knowledge about the settlement of the Americas via the Bering land bridge that connected Asia and North America during the last glacial maximum around 15,000 to 23,000 years ago [42,43]. We note, however, that recent admixture and the sampling strategies of the different data sets [44,45] can have a strong impact on age distributions. For example, variants at high frequency within American populations, but which are nevertheless restricted to just American and African populations, are on average younger than lower frequency variants (within American populations) with the same geographical restriction (Supplementary Figure S6).…”
Section: Distribution Of Allele Age In the Human Genome
Citation type: supporting
confidence: 84%
“…7e). However, the limited sampling of these SNP data sets may have biased STRUCTURE results (Shringarpure & Xing, 2014), and we observe very little separation of Foothills region individuals with DAPC (Fig. 3b, c).…”
Section: Population Structure
Citation type: mentioning
confidence: 83%