Human populations feature both discrete and continuous patterns of variation. Current analysis approaches struggle to jointly identify these patterns because of modelling assumptions, mathematical constraints, or numerical challenges. Here we apply uniform manifold approximation and projection (UMAP), a non-linear dimension reduction tool, to three well-studied genotype datasets and discover overlooked subpopulations within the American Hispanic population, fine-scale relationships between geography, genotypes, and phenotypes in the UK population, and cryptic structure in the Thousand Genomes Project data. This approach is well-suited to the influx of large and diverse data and opens new lines of inquiry in population-scale datasets.
Uniform manifold approximation and projection (UMAP) has been rapidly adopted by the population genetics community to study population structure. It is now commonly used to visualize the ancestral composition of human genetic datasets, to search for distinct clusters of data, and to identify geographic patterns. Here we give an overview of applications of UMAP in population genetics, provide recommendations for best practices, and offer insights on optimal uses for the technique.
Geneticists have known for more than a decade that their focus on people with European ancestry exacerbates health disparities 1. A 2018 analysis of studies looking for genetic variants associated with disease found that under-representation persists: 78% of study participants were of European ancestry, compared with 10% of Asian ancestry and 2% of African ancestry. Other ancestries each represented less than 1% of the total 2. Several projects, such as H3Africa 3, are starting to increase participation of under-represented groups, both among participants and among researchers. Large biobanks assembled in Europe and North America, combining biological samples with health-related data, also set sampling targets to increase diversity 4,5,6. But even when data from minority groups are available, many researchers discard them 7.
Cortical arealization arises during neurodevelopment from the confluence of molecular gradients representing patterned expression of morphogens and transcription factors. However, how these gradients relate to adult brain function, and whether they are maintained in the adult brain, remains unknown. Here we uncover three axes of topographic variation in gene expression in the adult human brain that specifically capture previously identified rostral-caudal, dorsal-ventral and medial-lateral axes of early developmental patterning. The interaction of these spatiomolecular gradients i) accurately predicts the location of unseen brain tissue samples, ii) delineates known functional territories, and iii) explains the topographical variation of diverse cortical features. The spatiomolecular gradients are distinct from canonical cortical functional hierarchies differentiating primary sensory cortex from association cortex, but radiate in parallel with the axes traversed by local field potentials along the cortex. We replicate all three molecular gradients in three independent human datasets as well as two non-human primate datasets, and find that each gradient shows a distinct developmental trajectory across the lifespan. The gradients are composed of several well-known morphogens (e.g., PAX6 and SIX3), and a small set of genes shared across gradients is strongly enriched for multiple diseases. Together, these results provide insight into the developmental sculpting of functionally distinct brain regions, governed by three robust transcriptomic axes embedded within brain parenchyma.