Accurate reconstruction of pedigrees from genetic data remains a challenging problem. Pedigree inference algorithms are often trained only on urban European-descent families, which are relatively 'outbred' compared with many other global populations. Some relationship categories (e.g. half-sibships versus avuncular relationships) are difficult to distinguish without external information. Furthermore, published software cannot accommodate endogamous populations, where there may be reticulations within a pedigree (i.e. inbreeding) or elevated haplotype sharing. We design a simple, rapid algorithm that initially uses only high-confidence first-degree relationships to seed a machine learning step based on the number of identical-by-descent segments. Additionally, we define a new statistic to polarize individuals into ancestor versus descendant generations. We test our approach in a sample of 700 individuals from an endogamous population in northern Namibia. Due to a culture of concurrent relationships in this population, there is a high proportion of half-sibships. We accurately identify first- through third-degree relationships for all categories, including half-sibships, half-avuncular-ships, etc. Accurate reconstruction of pedigrees holds promise for tracing allele frequency trajectories, improved phasing and other population genomic questions.
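To illustrate why categories such as half-sibships and avuncular pairs are hard to separate, the sketch below lists the textbook expected fraction of the autosomal genome shared identical by descent for several relationships in an outbred pedigree, and returns every category compatible with an observed sharing fraction. It is not the algorithm described in the abstract: the table, the function name `candidates_for_sharing` and the `tolerance` value are illustrative assumptions, and in an endogamous population background sharing would inflate the observed values.

```python
# Textbook expectations for an outbred pedigree (illustrative table, not from the abstract):
# relationship -> (degree, expected fraction of the autosomal genome shared IBD)
EXPECTED_IBD = {
    "parent-offspring": (1, 0.5),
    "full siblings": (1, 0.5),
    "half siblings": (2, 0.25),
    "avuncular": (2, 0.25),
    "grandparent-grandchild": (2, 0.25),
    "first cousins": (3, 0.125),
    "half-avuncular": (3, 0.125),
}


def candidates_for_sharing(fraction, tolerance=0.05):
    """Return, in alphabetical order, the relationships whose expected IBD
    fraction lies within `tolerance` (a hypothetical cutoff) of `fraction`."""
    return sorted(
        name
        for name, (_degree, expected) in EXPECTED_IBD.items()
        if abs(fraction - expected) <= tolerance
    )


if __name__ == "__main__":
    # A pair sharing about a quarter of the genome fits three different categories.
    print(candidates_for_sharing(0.25))
```

Because total sharing alone returns several candidates for the same degree, additional signals such as the number of identical-by-descent segments or generational direction are needed to tell them apart.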
Recessive alleles have been shown to directly affect both human Mendelian disease phenotypes and complex traits like height. Pedigree studies also suggest that consanguinity results in increased childhood mortality and adverse health phenotypes, presumably through penetrance of recessive mutations. Here, we test whether the accumulation of homozygous, recessive alleles decreases reproductive success in a human population. We address this question among the Namibian Himba, an endogamous agro-pastoralist population, who until very recently practiced natural fertility. Using a sample of 681 individuals, we show that Himba exhibit elevated levels of "inbreeding", calculated as the fraction of the genome in runs of homozygosity (FROH). Many individuals contain multiple long segments of ROH in their genomes, indicating that their parents had high kinship coefficients. However, we did not find evidence that this is explained by first-cousin consanguinity, despite a reported social preference for cross-cousin marriages. Rather, we show that elevated haplotype sharing in the Himba is due to a bottleneck, likely in the past 60 generations. We test whether increased recessive mutation load results in observed fitness consequences by assessing the effects of FROH on completed fertility in a cohort of post-reproductive women (n=69). We find that higher FROH is significantly associated with lower fertility among women who have had at least one child (p<0.006). Our data suggest a multi-locus genetic effect on fitness driven by the expression of deleterious recessive alleles, especially those in long ROH. However, these effects are not the result of consanguinity but rather elevated background identity by descent.
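As a minimal sketch of the FROH statistic mentioned above, the fraction of the genome in runs of homozygosity can be computed by summing the lengths of called ROH segments and dividing by the length of the genome surveyed. The abstract does not say how ROH were called or filtered, so the segment format `(chromosome, start, end)`, the `min_length_bp` threshold and the genome length used here are assumptions for illustration only.

```python
from collections import defaultdict


def froh(roh_segments, genome_length_bp, min_length_bp=0):
    """Fraction of the genome covered by runs of homozygosity.

    roh_segments: iterable of (chromosome, start_bp, end_bp) tuples (assumed format).
    genome_length_bp: length of the genome surveyed (assumed to be supplied by the caller).
    min_length_bp: hypothetical filter; segments shorter than this are ignored.
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")

    # Group segments by chromosome, dropping those below the length threshold.
    by_chrom = defaultdict(list)
    for chrom, start, end in roh_segments:
        if end - start >= min_length_bp:
            by_chrom[chrom].append((start, end))

    covered = 0
    for intervals in by_chrom.values():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        # Merge overlapping intervals so no base pair is counted twice.
        for start, end in intervals[1:]:
            if start <= cur_end:
                cur_end = max(cur_end, end)
            else:
                covered += cur_end - cur_start
                cur_start, cur_end = start, end
        covered += cur_end - cur_start

    return covered / genome_length_bp


if __name__ == "__main__":
    # Toy segments and a toy genome length, not real data.
    segments = [("1", 0, 1_000_000), ("1", 500_000, 2_000_000), ("2", 0, 500_000)]
    print(froh(segments, genome_length_bp=10_000_000))
```

Raising `min_length_bp` restricts the statistic to longer segments, which is one way to focus on long ROH of the kind the abstract highlights.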
Precision medicine initiatives across the globe have driven rapid growth in repositories linking large-scale genomic data with electronic health records (EHRs), enabling genomic analyses across the entire phenome. Many of these initiatives focus solely on research insights, leading to limited direct benefit to patients. We describe the Biobank at the Colorado Center for Personalized Medicine (CCPM Biobank), which was jointly developed by the University of Colorado Anschutz Medical Campus and UCHealth to serve as a unique, dual-purpose research and clinical resource accelerating personalized medicine. This living resource currently has over 200,000 patients, with ongoing recruitment. We highlight the clinical, laboratory, regulatory, and HIPAA-compliant informatics infrastructure along with our stakeholder engagement, consent, recontact, and participant engagement strategies. We characterize aspects of genetic and geographic diversity unique to the Rocky Mountain Region, the primary catchment area for CCPM Biobank participants. We leverage linked health and demographic information from the CCPM Biobank participant population to demonstrate the utility of the CCPM Biobank for replicating complex trait associations in the first 33,674 genotyped patients across multiple disease domains. Finally, we describe our current efforts towards return of clinical genetic test results, including high-impact pathogenic variants and pharmacogenetic information, and our broader goals as the CCPM Biobank continues to grow. Bringing clinical and research interests together fosters unique clinical and translational questions that can be addressed from the large EHR-linked CCPM Biobank resource within a HIPAA-compliant and CLIA-certified environment.