Understanding the functional consequences of genetic variation, and how it affects complex human disease and quantitative traits, remains a critical challenge for biomedicine. We present an analysis of RNA sequencing data from 1641 samples across 43 tissues from 175 individuals, generated as part of the pilot phase of the Genotype-Tissue Expression (GTEx) project. We describe the landscape of gene expression across tissues, catalog thousands of tissue-specific and shared regulatory expression quantitative trait loci (eQTL) variants, describe complex network relationships, and identify signals from genome-wide association studies explained by eQTLs. These findings provide a systematic understanding of the cellular and biological consequences of human genetic variation and of the heterogeneity of such effects among a diverse set of human tissues.
Transcriptional regulation and posttranscriptional processing underlie many cellular and organismal phenotypes. We used RNA sequence data generated by the Genotype-Tissue Expression (GTEx) project to investigate the patterns of transcriptome variation across individuals and tissues. Tissues exhibit characteristic transcriptional signatures that show stability in postmortem samples. These signatures are dominated by a relatively small number of genes, which is most clearly seen in blood, though few are exclusive to a particular tissue, and they vary more across tissues than across individuals. Genes exhibiting high interindividual expression variation include disease candidates associated with sex, ethnicity, and age. Primary transcription is the major driver of cellular specificity, with splicing playing mostly a complementary role, except in the brain, which exhibits a more divergent splicing program. In contrast, variation in splicing, despite its stochasticity, may play a comparatively greater role in defining individual phenotypes.
De novo mutations (DNMs) originating in gametogenesis are an important source of genetic variation. We use a data set of 7,216 autosomal DNMs with resolved parent of origin from whole-genome sequencing of 816 parent-offspring trios to investigate differences between maternally and paternally derived DNMs and study the underlying mutational mechanisms. Our results show that the number of DNMs in offspring increases not only with paternal age, but also with maternal age, and that some genome regions show enrichment for maternally derived DNMs. We identify parent-of-origin-specific mutation signatures that become more pronounced with increased parental age, pointing to different mutational mechanisms in spermatogenesis and oogenesis. Moreover, we find that spatially clustered DNMs have a unique mutational signature with no significant differences between parental alleles, suggesting a different mutational mechanism. Our findings provide insights into the molecular mechanisms that underlie mutagenesis and are relevant to disease and evolution in humans.
Clustering of mutations has been observed in cancer genomes as well as for germline de novo mutations (DNMs). We identified 1,796 clustered DNMs (cDNMs) within whole-genome-sequencing data from 1,291 parent-offspring trios to investigate their patterns and infer a mutational mechanism. We found that the number of clusters on the maternal allele was positively correlated with maternal age and that these clusters consisted of more individual mutations with larger intermutational distances than those of paternal clusters. More than 50% of maternal clusters were located on chromosomes 8, 9 and 16, in previously identified regions with accelerated maternal mutation rates. Maternal clusters in these regions showed a distinct mutation signature characterized by C>G transversions. Finally, we found that maternal clusters were associated with processes involving double-strand breaks (DSBs), such as meiotic gene conversions and de novo deletion events. This result suggested accumulation of DSB-induced mutations throughout oocyte aging as the mechanism underlying the formation of maternal mutation clusters.
Human germline de novo mutations (DNMs) are both a driver of evolution and an important cause of genetic diseases. In the past few years, whole-genome sequencing (WGS) of parent-offspring trios has facilitated the large-scale detection and study of human DNMs, which has led to exciting discoveries. The overarching theme of all of these studies is that the DNMs of an individual are a complex mixture of mutations that arise through different biological processes acting at different times during human development and life.
De novo mutations
Human de novo mutations (DNMs, see Glossary) are germline mutations that newly occurred within one generation. While the vast majority of the genome has been inherited from earlier generations, DNMs provide new genetic variation. The consequences of a new mutation can vary widely. While neutral or advantageous mutations might become established in the genome of our species and thereby contribute to human evolution, changes to crucial genetic sequences can also cause biological systems to malfunction, resulting in severe disease. One of the earliest known examples of this was Down syndrome, which is caused by a de novo trisomy of chromosome 21 [1-3]. In recent years DNMs have been found to be a prominent cause of neurodevelopmental diseases, including intellectual disability, autism, and schizophrenia [4,5]. The unbiased study of de novo point mutations in humans was for many years hampered by the lack of techniques to scan the entire genome in a cost-effective way. The introduction of next-generation sequencing (NGS) technologies has spurred investigations of DNMs in humans [6]. The term DNM can refer to a variety of mutation types, such as single-nucleotide substitutions, insertions, deletions, and copy-number variants (CNVs).
In this review we focus on single-nucleotide mutations and survey the progress made in this field since the introduction of WGS, exploring their biology and possible underlying mechanisms (Figure 1, Key Figure), but not their potential pathological consequences.
Biological mechanisms underlying human germline mutations remain largely unknown. We statistically decompose variation in the rate and spectra of mutations along the genome using volume-regularized nonnegative matrix factorization. The analysis of a sequencing dataset (TOPMed) reveals nine processes that explain the variation in mutation properties between loci. We provide a biological interpretation for seven of these processes. We associate one process with bulky DNA lesions that resolve asymmetrically with respect to transcription and replication. Two processes track the direction of the replication fork and replication timing, respectively. We identify a mutagenic effect of active demethylation primarily acting in regulatory regions and a mutagenic effect of LINE repeats. We localize a mutagenic process specific to oocytes from population sequencing data. This process appears transcriptionally asymmetric.
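The decomposition idea mentioned in this abstract can be illustrated with a minimal sketch. It uses plain nonnegative matrix factorization with Lee-Seung multiplicative updates, not the volume-regularized variant the authors describe, and it runs on random toy data rather than TOPMed. The function name `nmf`, the matrix sizes, and the choice of two hidden processes are assumptions made purely for illustration.

```python
import numpy as np

def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Plain NMF (no volume regularization): find W, H >= 0 with V ~ W @ H."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = V.shape
    W = rng.random((n_rows, rank)) + eps   # loadings of each locus on each process
    H = rng.random((rank, n_cols)) + eps   # mutation-type profile of each process
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical toy input: 60 loci x 6 mutation types, built from 2 hidden processes.
rng = np.random.default_rng(1)
V = rng.random((60, 2)) @ rng.random((2, 6))

W, H = nmf(V, rank=2)
rel_error = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(f"relative reconstruction error: {rel_error:.4f}")
```

In this sketch each row of `H` plays the role of one mutational process and each row of `W` gives the intensity of those processes at one locus. How many processes to fit, and how to regularize them, are modeling choices that this sketch does not address.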
Germline de novo mutation clusters arise during oocyte aging in genomic regions with high double-strand-break incidence.
A genome-wide evaluation of the effects of ionizing radiation on mutation induction in the mouse germline has identified multisite de novo mutations (MSDNs) as a marker for previous exposure. Here we present the results of a small pilot study of whole-genome sequencing in offspring of soldiers who served in radar units on weapon systems that were emitting high-frequency radiation. We found cases of exceptionally high MSDN rates as well as an increased mean in our cohort: whereas an MSDN is detected on average in 1 out of 5 offspring of unexposed controls, we observed 12 MSDNs in a total of 18 offspring, including a family with 6 MSDNs in 3 offspring. Moreover, we found two translocations, also resulting from neighboring mutations. Our findings indicate that MSDNs might in principle also be suited for assessing DNA damage from ionizing radiation in humans. However, as exact person-related dose values in risk groups are usually not available, the interpretation of MSDNs in single families would benefit from larger molecular epidemiologic studies on this new biomarker.
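The rate comparison quoted in this abstract can be worked through as simple arithmetic. The sketch below uses only the numbers given in the text (1 MSDN per 5 control offspring, 12 MSDNs in 18 cohort offspring). The function name `msdn_summary` is hypothetical, and no significance test is implied.

```python
def msdn_summary(observed_msdns, n_offspring, control_rate):
    """Compare an observed MSDN count with the count expected at the control rate."""
    observed_rate = observed_msdns / n_offspring    # MSDNs per offspring in the cohort
    expected_msdns = control_rate * n_offspring     # count expected if the cohort matched controls
    fold_increase = observed_rate / control_rate    # cohort rate relative to controls
    return observed_rate, expected_msdns, fold_increase

# Figures taken from the abstract: 12 MSDNs in 18 offspring; controls at 1 in 5.
observed_rate, expected_msdns, fold_increase = msdn_summary(12, 18, 1 / 5)
print(f"observed rate:  {observed_rate:.2f} MSDNs per offspring")
print(f"expected count: {expected_msdns:.1f} MSDNs at the control rate")
print(f"fold increase:  {fold_increase:.2f}")
```

At the control rate, about 3.6 MSDNs would be expected among 18 offspring, so the 12 observed correspond to roughly 0.67 per offspring, a bit more than three times the control rate. With a cohort this small the comparison is descriptive only, in line with the abstract's call for larger studies.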