Fields as diverse as human genetics and sociology are increasingly using polygenic scores based on genome-wide association studies (GWAS) for phenotypic prediction. However, recent work has shown that polygenic scores have limited portability across groups of different genetic ancestries, restricting the contexts in which they can be used reliably and potentially creating serious inequities in future clinical applications. Using the UK Biobank data, we demonstrate that even within a single ancestry group (i.e., when there are negligible differences in linkage disequilibrium or in causal allele frequencies), the prediction accuracy of polygenic scores can depend on characteristics such as the socio-economic status, age or sex of the individuals in which the GWAS and the prediction were conducted, as well as on the GWAS design. Our findings highlight both the complexities of interpreting polygenic scores and underappreciated obstacles to their broad use.
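To make "prediction accuracy within a group" concrete, here is a minimal sketch of how a polygenic score and its accuracy are commonly computed: the score is a weighted sum of allele dosages, and accuracy is the squared correlation between score and phenotype, evaluated separately in each group. The function names and the toy numbers are illustrative assumptions, not data or code from the study.

```python
def polygenic_score(dosages, effect_sizes):
    """Weighted sum of allele dosages (0, 1 or 2) using GWAS effect estimates."""
    return sum(beta * g for beta, g in zip(effect_sizes, dosages))

def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

# Toy example (made-up numbers): three SNPs, two groups of four individuals.
effects = [0.2, -0.1, 0.5]
groups = {
    "group_a": ([[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 0]], [1.0, 0.2, 0.7, -0.3]),
    "group_b": ([[2, 1, 0], [0, 0, 1], [1, 2, 2], [1, 0, 0]], [0.1, 0.9, 0.4, 0.5]),
}
for name, (genotypes, phenotypes) in groups.items():
    scores = [polygenic_score(g, effects) for g in genotypes]
    print(name, round(r_squared(scores, phenotypes), 3))
```

Comparing the per-group values is the kind of stratified evaluation the abstract describes; real analyses would also adjust for covariates, which this sketch omits.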
The sources of human germline mutations are poorly understood. Part of the difficulty is that mutations occur very rarely, and so direct pedigree-based approaches remain limited in the numbers that they can examine. To address this problem, we consider the spectrum of low-frequency variants in a dataset (Genome Aggregation Database, gnomAD) of 13,860 human X chromosomes and autosomes. X-autosome differences are reflective of germline sex differences and have been used extensively to learn about male versus female mutational processes; what is less appreciated is that they also reflect chromosome-level biochemical features that differ between the X and autosomes. We tease these components apart by comparing the mutation spectrum in multiple genomic compartments on the autosomes and between the X and autosomes. In so doing, we are able to ascribe specific mutation patterns to replication timing and recombination and to identify differences in the types of mutations that accrue in males and females. In particular, we identify C > G as a mutagenic signature of male meiotic double-strand breaks on the X, which may result from late repair. Our results show how biochemical processes of damage and repair in the germline interact with sex-specific life history traits to shape mutation patterns on both the X chromosome and autosomes.
Causal loss-of-function (LOF) variants for Mendelian and severe complex diseases are enriched in 'mutation intolerant' genes. We show how such observations can be interpreted in light of a model of mutation-selection balance, and use the model to relate the present-day pathogenic consequences of LOF mutations to their evolutionary fitness effects. To this end, we first infer posterior distributions for the fitness costs of LOF mutations in 17,318 autosomal and 679 X-linked genes from exome sequences in 56,855 individuals. Estimated fitness costs for the loss of a gene copy are typically above 1%; they tend to be largest for X-linked genes, whether or not they have a Y homolog, followed by autosomal genes and genes in the pseudoautosomal region. We then compare inferred fitness effects for all possible de novo LOF mutations to those of de novo mutations identified in individuals diagnosed with one of six severe, complex diseases or developmental disorders. Probands carry an excess of mutations with estimated fitness effects above 10%; as we show by simulation, when sampled in the population, such highly deleterious mutations are typically only a couple of generations old. Moreover, the proportion of highly deleterious mutations carried by probands reflects the typical age of onset of the disease. The study design also has a discernible influence: a greater proportion of highly deleterious mutations is detected in pedigree than in case-control studies, and for autism, in simplex than in multiplex families and in female versus male probands. Thus, anchoring observations in human genetics to a population genetic model allows us to learn about the fitness effects of mutations identified by different mapping strategies and for different traits.
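As a rough illustration of the mutation-selection balance reasoning, the sketch below uses the standard deterministic approximation for an autosomal gene: if LOF mutations arise at rate u per gene copy per generation and heterozygous carriers suffer a fitness cost hs, the equilibrium LOF allele frequency is approximately u / hs. The function names and the value u = 1e-6 are assumptions chosen for illustration; the study itself infers posterior distributions rather than using this point approximation, and the approximation ignores genetic drift and X-linkage.

```python
def equilibrium_frequency(u, hs):
    """Deterministic mutation-selection balance frequency, q ~ u / hs."""
    if hs <= 0:
        raise ValueError("hs must be positive for this approximation")
    return u / hs

def expected_lof_copies(u, hs, n_individuals):
    """Expected LOF allele count among 2n autosomal gene copies."""
    return 2 * n_individuals * equilibrium_frequency(u, hs)

u = 1e-6          # assumed per-gene LOF mutation rate (illustrative only)
n = 56_855        # sample size quoted in the abstract
for hs in (0.01, 0.10):
    copies = expected_lof_copies(u, hs, n)
    print(f"hs={hs:.2f}: expected LOF copies in sample = {copies:.2f}")
```

The tenfold drop in expected copies between a 1% and a 10% fitness cost shows why a scarcity of LOF variants in a large sample is informative about strong selection.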
Whole exome sequences have now been collected for millions of humans, with the related goals of identifying pathogenic mutations in patients and establishing reference repositories of data from unaffected individuals. As a result, we are approaching an important limit, in which datasets are large enough that, in the absence of natural selection, every highly mutable site will have experienced at least one mutation in the genealogical history of the sample. Here, we focus on CpG sites that are methylated in the germline and experience mutations to T at an elevated rate of ~10⁻⁷ per site per generation; considering synonymous mutations in a sample of 390,000 individuals, ~99% of such CpG sites harbor a C/T polymorphism. Methylated CpG sites provide a natural mutation saturation experiment for fitness effects: as we show, at current sample sizes, not seeing a non-synonymous polymorphism is indicative of strong selection against that mutation. We rely on this idea in order to directly identify a subset of CpG transitions that are likely to be highly deleterious, including ~27% of possible loss-of-function mutations, and up to 20% of possible missense mutations, depending on the type of functional site in which they occur. Unlike methylated CpGs, most mutation types, with rates on the order of 10⁻⁸ or 10⁻⁹, remain very far from saturation. We discuss what these findings imply for interpreting the potential clinical relevance of mutations from their presence or absence in reference databases and for inferences about the fitness effects of new mutations.
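The saturation argument can be checked with a back-of-the-envelope calculation. The sketch below assumes that neutral mutations fall on the sample genealogy as a Poisson process and that all sites share one total branch length; both are simplifications, and the function names are ours rather than the study's. The branch length is calibrated from the reported figures (rate of about 10⁻⁷, about 99% of sites polymorphic) and then reused for slower mutation types.

```python
import math

def prob_polymorphic(mu, total_branch_length):
    """P(at least one mutation on the genealogy) under a Poisson model."""
    return 1.0 - math.exp(-mu * total_branch_length)

def implied_branch_length(mu, fraction_polymorphic):
    """Total branch length (in generations) giving the stated saturation."""
    return -math.log(1.0 - fraction_polymorphic) / mu

# Calibrate on the reported values for methylated CpG transitions.
T = implied_branch_length(1e-7, 0.99)

for mu in (1e-7, 1e-8, 1e-9):
    print(f"mu={mu:.0e}: P(polymorphic) = {prob_polymorphic(mu, T):.3f}")
```

Under these assumptions, sites mutating at 10⁻⁸ would be polymorphic roughly 37% of the time and sites at 10⁻⁹ under 5% of the time, consistent with the statement that most mutation types remain far from saturation.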
Functionally redundant genes present a puzzle as to their evolutionary preservation, and offer an interesting opportunity for molecular specialization. In C. elegans, either one of two presenilin genes (sel-12 or hop-1) facilitates Notch activation, providing the catalytic subunit for the γ secretase proteolytic enzyme complex. For all known Notch signaling events, sel-12 can mediate Notch activation, so the conservation of hop-1 remains a mystery. Here, we uncover a novel "late-onset" germline Notch phenotype in which HOP-1-deficient worms fail to maintain proliferating germline stem cells during adulthood. Either SEL-12 or HOP-1 presenilin can impart sufficient Notch signaling for the establishment and expansion of the germline, but maintenance of an adult stem cell pool relies exclusively on HOP-1-mediated Notch signaling. We also show that HOP-1 is necessary for maximum fecundity and reproductive span. The low-fecundity phenotype of hop-1 mutants can be phenocopied by switching off glp-1/Notch function during the last stage of larval development. We propose that at the end of larval development, dual presenilin usage switches exclusively to HOP-1, perhaps offering opportunities for differential regulation of the germline during adulthood. Additional defects in oocyte size and production rate in hop-1 and glp-1 mutants indicate that the process of oogenesis is compromised when germline Notch signaling is switched off. We calculate that in wild-type adults, as much as 86% of cells derived from the stem cell pool function to support oogenesis. This work suggests that an important role for Notch signaling in the adult germline is to furnish a large and continuous supply of nurse cells to support the efficiency of oogenesis.
New emerging infectious diseases are identified every year, a subset of which become global pandemics like COVID-19. In the case of COVID-19, many governments have responded to the ongoing pandemic by imposing social policies that restrict contacts outside of the home, resulting in a large fraction of the workforce either working from home or not working. To ensure essential services, however, a substantial number of workers are not subject to these limitations, and maintain many of their pre-intervention contacts. To explore how contacts among such “essential” workers, and between essential workers and the rest of the population, impact disease risk and the effectiveness of pandemic control, we evaluated several mathematical models of essential worker contacts within a standard epidemiology framework. The models were designed to correspond to key characteristics of cashiers, factory employees, and healthcare workers. We find in all three models that essential workers are at substantially elevated risk of infection compared to the rest of the population, as has been documented, and that increasing the numbers of essential workers necessitates the imposition of more stringent controls on contacts among the rest of the population to manage the pandemic. Importantly, however, different archetypes of essential workers differ in both their individual probability of infection and impact on the broader pandemic dynamics, highlighting the need to understand and target intervention for the specific risks faced by different groups of essential workers. These findings, especially in light of the massive human costs of the current COVID-19 pandemic, indicate that contingency plans for future epidemics should account for the impacts of essential workers on disease spread.
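To illustrate the kind of reasoning involved, here is a minimal two-group SIR sketch in which essential workers keep more daily contacts than the rest of the population and contacts mix proportionately. It is not one of the models evaluated in the study; the function name, the proportionate-mixing assumption, and every parameter value are hypothetical choices made only for illustration.

```python
def simulate_two_group_sir(frac_essential=0.1, contacts_essential=10.0,
                           contacts_rest=3.0, p_transmit=0.05,
                           recovery_rate=0.2, days=365, dt=0.1):
    """Euler-integrate a two-group SIR model with proportionate mixing.

    Returns the cumulative fraction infected in (essential, rest).
    """
    sizes = [frac_essential, 1.0 - frac_essential]
    contacts = [contacts_essential, contacts_rest]
    seed = 1e-4  # initial infectious fraction within each group (assumed)
    S = [n * (1.0 - seed) for n in sizes]
    I = [n * seed for n in sizes]
    total_contacts = sum(c * n for c, n in zip(contacts, sizes))

    for _ in range(int(days / dt)):
        # Probability that a randomly chosen contact is infectious.
        p_inf_contact = sum(c * i for c, i in zip(contacts, I)) / total_contacts
        for g in range(2):
            new_infections = p_transmit * contacts[g] * p_inf_contact * S[g] * dt
            new_recoveries = recovery_rate * I[g] * dt
            S[g] -= new_infections
            I[g] += new_infections - new_recoveries

    return tuple(1.0 - S[g] / sizes[g] for g in range(2))

attack_essential, attack_rest = simulate_two_group_sir()
print(f"essential workers infected: {attack_essential:.2f}")
print(f"rest of population infected: {attack_rest:.2f}")
```

Varying `frac_essential` or `contacts_rest` in this sketch gives a feel for the trade-off described above: the more workers retain their contacts, the more the remaining population's contacts must be reduced to keep the epidemic in check.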
Whole exome sequences have now been collected for millions of humans, with the related goals of identifying pathogenic mutations in patients and establishing reference repositories of data from unaffected individuals. As a result, we are approaching an important limit, in which datasets are large enough that, in the absence of natural selection, every highly mutable site will have experienced at least one mutation in the genealogical history of the sample. Here, we focus on putatively neutral, synonymous CpG sites that are methylated in the germline and experience mutations to T at an elevated rate of ~10⁻⁷ per site per generation; in a sample of 390,000 individuals, ~99% of such CpG sites harbor a C/T polymorphism. These CpG sites provide a natural mutation saturation experiment for fitness effects: as we show, at current sample sizes, not seeing a polymorphism is indicative of strong selection against that mutation. We rely on this idea in order to directly identify a subset of highly deleterious CpG transitions, including ~27% of possible loss-of-function mutations, and up to 21% of possible missense mutations, depending on the type of site in which they occur. Unlike methylated CpGs, most mutation types, with rates on the order of 10⁻⁸ or 10⁻⁹, remain very far from saturation. We discuss what this contrast implies about interpreting the potential clinical relevance of mutations from their presence or absence in reference databases and for inferences about the fitness effects of new mutations.