2020
DOI: 10.1038/s41598-020-68259-w

Robust genome-wide ancestry inference for heterogeneous datasets: illustrated using the 1,000 genome project with 3D facial images

Abstract: Estimates of individual-level genomic ancestry are routinely used in human genetics, and related fields. The analysis of population structure and genomic ancestry can yield insights in terms of modern and ancient populations, allowing us to address questions regarding admixture, and the numbers and identities of the parental source populations. Unrecognized population structure is also an important confounder to correct for in genome-wide association studies. However, it remains challenging to work with hetero…

Cited by 9 publications (13 citation statements)
References 47 publications
“…SUGIBS was previously proposed as a robust alternative against laboratory artifacts and outliers [14] by applying SVD on the IBS generalized genotype matrix, where IBS information corrects for potential artifacts due to errors and missingness.…”
Section: Spectral Decomposition Generalized By Identity-by-state Matrix (mentioning)
confidence: 99%
“…Loadings are also offered by databases like gnomAD [29] and the UK Biobank [30]. PCA serves as the primary tool to identify the origins of ancient samples in paleogenomics [14], to identify biomarkers for forensic reconstruction in evolutionary biology [31], and geolocalize samples [32]. As of April 2022, 32,000-216,000 genetic papers employed PC scatterplots to interpret genetic data, draw historical and ethnobiological conclusions, and describe the evolution of various taxa from prehistorical times to the present, no doubt Herculean tasks for any scatterplot.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, high-quality genotype data is critical to the success of PCA [10]. In the presence of missing genotypes or errors in the target dataset, PCA has shown to produce patterns of misalignment during projection [14, 15]. To overcome this problem, a robust alternative was recently proposed, known as SUGIBS, which utilizes spectral (S) decomposition of an unnormalized genomic (UG) relationship matrix generalized by an Identity-by-State (IBS) similarity matrix between the samples to be projected and individuals in the reference dataset [14].…”
Section: Introduction (mentioning)
confidence: 99%
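
The citation statements above describe SUGIBS as relying on an Identity-by-State (IBS) similarity matrix between the samples to be projected and the reference individuals. As a minimal sketch of that one ingredient only, the Python below computes a common allele-sharing form of IBS similarity and skips SNPs with a missing call in either sample. The function names, the 0/1/2 genotype coding, and the use of `None` for a missing call are illustrative assumptions; this is not the paper's implementation, and the exact IBS definition used by SUGIBS may differ.

```python
from typing import List, Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of a given allele; None marks a missing call.
Genotype = Optional[int]


def ibs_similarity(a: Sequence[Genotype], b: Sequence[Genotype]) -> float:
    """Proportion of alleles shared IBS, averaged over SNPs called in both samples."""
    shared = 0
    called = 0
    for ga, gb in zip(a, b):
        if ga is None or gb is None:
            continue  # skip SNPs that are missing in either sample
        shared += 2 - abs(ga - gb)  # 2, 1 or 0 alleles shared at this SNP
        called += 1
    if called == 0:
        raise ValueError("no SNP is called in both samples")
    return shared / (2 * called)


def ibs_matrix(
    targets: Sequence[Sequence[Genotype]],
    references: Sequence[Sequence[Genotype]],
) -> List[List[float]]:
    """Rows are target samples, columns are reference samples."""
    return [[ibs_similarity(t, r) for r in references] for t in targets]


# Hypothetical toy data: two reference individuals, one target with a missing call.
references = [[0, 1, 2, 2], [2, 2, 0, 1]]
targets = [[0, 1, None, 2]]
similarities = ibs_matrix(targets, references)
```

Because each pairwise value is normalized by the number of SNPs called in both samples, a missing genotype shrinks the comparison set for that pair rather than being treated as an allele count.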