Integrative methods, like colocalization and transcriptome-wide association studies (TWAS), identify transcriptomic mechanisms at only a small fraction of trait-associated genetic loci from genome-wide association studies (GWAS). Here, we show that a reliance on reference functional genomics panels of only total gene expression greatly contributes to this reduced discovery. This is particularly relevant for neuropsychiatric traits, as the brain expresses extensive, complex, and unique alternative splicing patterns giving rise to multiple genetically-regulated transcript-isoforms per gene. We introduce isoTWAS, a scalable, multivariate framework to integrate genetics, isoform-level expression, and phenotypic associations in a step-wise testing framework. Multivariate predictive models were trained using isoform-level expression across tissues from the Genotype-Tissue Expression Project and in the developing and adult human brain from PsychENCODE. Across five neuropsychiatric traits, isoTWAS dramatically increased discovery of trait associations within GWAS loci, capturing 92 unique loci compared with 27 using gene-level TWAS. These power gains reflected a ~2-fold increase in the number of testable genes, a ~15-35% increase in total gene expression prediction accuracy, and the ability to jointly capture expression and splicing mechanisms. Results from extensive simulations show no increase in false discovery rate and reinforce isoTWAS's advantages in prediction and trait mapping power over TWAS, especially when genetic effects on expression vary across isoforms of the same gene. We illustrate multiple biologically-relevant isoTWAS-identified trait associations undetectable by gene-level methods, including isoforms of AKT3, GIGYF2, and KMT5A with schizophrenia risk. The isoTWAS framework addresses an unmet need to consider the transcriptome on the transcript-isoform level to maximize discovery of trait associations, especially for brain-relevant traits.
Genomic regulatory elements active in the developing human brain are notably enriched in genetic risk for neuropsychiatric disorders, including autism spectrum disorder (ASD), schizophrenia, and bipolar disorder. However, prioritizing the specific risk genes and candidate molecular mechanisms underlying these genetic enrichments has been hindered by the lack of a single unified large-scale gene regulatory atlas of human brain development. Here, we uniformly process and systematically characterize gene, isoform, and splicing quantitative trait loci (xQTLs) in 672 fetal brain samples from unique subjects across multiple ancestral populations. We identify 15,752 genes harboring a significant xQTL and map 3,739 eQTLs to a specific cellular context. We observe a striking drop in gene expression and splicing heritability as the human brain develops. Isoform-level regulation, particularly in the second trimester, mediates the greatest proportion of heritability across multiple psychiatric GWAS, compared with eQTLs. Via colocalization and TWAS, we prioritize biological mechanisms for ~60% of GWAS loci across five neuropsychiatric disorders, nearly two-fold that observed in the adult brain. Finally, we build a comprehensive set of developmentally regulated gene and isoform co-expression networks capturing unique genetic enrichments across disorders. Together, this work provides a comprehensive view of genetic regulation across human brain development as well as the stage- and cell type-informed mechanistic underpinnings of neuropsychiatric disorders.
Human brain development is under tight molecular genetic control, and the recent advent of single-cell genomics has revolutionized our ability to elucidate the diverse underlying cell-types and states. Although RNA splicing is highly prevalent in the brain and has strong links to neuropsychiatric disorders, previous work has not systematically investigated the role of cell-type-specific splicing or transcript-isoform diversity during human brain development. Here, we leverage single molecule long-read sequencing to deeply profile the full-length transcriptome of the germinal zone (GZ) and cortical plate (CP) regions of the developing human neocortex at tissue and single-cell resolution. We identify 214,516 unique isoforms, corresponding to 22,391 genes. Remarkably, we find that 72.6% of these are novel and that, together with >7,000 novel spliced exons, they expand the proteome by 92,422 proteoforms. We uncover myriad novel isoform switches during cortical neurogenesis, implicating previously-uncharacterized RNA-binding protein-mediated and other regulatory mechanisms in cellular identity and disease. Early-stage excitatory neurons exhibit the greatest isoform diversity, and isoform-based single-cell clustering identifies previously uncharacterized cell states. Leveraging this resource, we re-prioritize thousands of rare de novo risk variants associated with neurodevelopmental disorders (NDDs) and reveal that risk genes are strongly associated with the number of unique isoforms observed per gene. Altogether, this work uncovers a substantial contribution of transcript-isoform diversity to cellular identity in the developing neocortex, elucidates novel genetic risk mechanisms for neurodevelopmental and neuropsychiatric disorders, and provides a comprehensive isoform-centric gene annotation for the developing human brain.
Multivariate variance components linear mixed models are fundamental statistical models in quantitative genetics, widely used to quantify SNP-based heritability (h²_SNP) and genetic correlation (r_g) across complex traits. However, maximum likelihood estimation of multivariate variance components models remains numerically challenging when the number of traits and variance components are both greater than two. To address this critical gap, here we introduce a novel statistical method for fitting multivariate variance components models. This method improves on existing methods by allowing for an arbitrary number of traits and/or variance components. We illustrate the utility of our method by characterizing for the first time the genetic architecture of isoform expression in the human brain, modeling up to 23 isoforms jointly across ∼900 individuals within PsychENCODE. We find a significant proportion of isoforms to be under genetic control (17,721 of 93,293 isoforms) with substantial shared genetic effects among local (or cis-) relative to distal (or trans-) genetic variants (median r_g,cis and r_g,trans = 0.31 and 0.06). Importantly, we find that 11.6% of brain-expressed genes (2,900 genes) are heritable only at the isoform-level. Integrating these isoform-specific genetic signals with psychiatric GWAS signals uncovers previously hidden psychiatric disease mechanisms. Specifically, we highlight reduced expression of a specific XRN2 isoform as the underlying driver of the strongest GWAS signal for autism spectrum disorder. Overall, our method for fitting multivariate variance components models is flexible, widely applicable, and is implemented in the Julia programming language and available online.