Background: The human genome contains “dark” gene regions that cannot be adequately assembled or aligned using standard short-read sequencing technologies, preventing researchers from identifying mutations within these gene regions that may be relevant to human disease. Here, we identify regions with few mappable reads that we call dark by depth, and others that have ambiguous alignment, called camouflaged. We assess how well long-read or linked-read technologies resolve these regions. Results: Based on standard whole-genome Illumina sequencing data, we identify 36,794 dark regions in 6054 gene bodies from pathways important to human health, development, and reproduction. Of these gene bodies, 8.7% are completely dark and 35.2% are ≥ 5% dark. We identify dark regions that are present in protein-coding exons across 748 genes. Linked-read or long-read sequencing technologies from 10x Genomics, PacBio, and Oxford Nanopore Technologies reduce dark protein-coding regions to approximately 50.5%, 35.6%, and 9.6%, respectively. We present an algorithm to resolve most camouflaged regions and apply it to the Alzheimer’s Disease Sequencing Project. We rescue a rare ten-nucleotide frameshift deletion in CR1, a top Alzheimer’s disease gene, found in disease cases but not in controls. Conclusions: While we could not formally assess the association of the CR1 frameshift mutation with Alzheimer’s disease due to insufficient sample size, we believe it merits investigation in a larger cohort. There remain thousands of potentially important genomic regions overlooked by short-read sequencing that are largely resolved by long-read technologies. Electronic supplementary material: The online version of this article (10.1186/s13059-019-1707-2) contains supplementary material, which is available to authorized users.
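The notion of a region being "dark by depth" can be illustrated with a minimal sketch that scans per-base read depth and reports contiguous low-coverage runs. This is not the authors' algorithm: the function names, the depth threshold of 5, and the toy depth values are all assumptions made for illustration, and the abstract does not state the criteria actually used.

```python
from typing import List, Tuple


def low_depth_regions(
    depths: List[int], max_depth: int = 5, min_length: int = 1
) -> List[Tuple[int, int]]:
    """Return half-open (start, end) intervals where depth <= max_depth.

    max_depth and min_length are illustrative defaults, not values
    taken from the paper.
    """
    regions: List[Tuple[int, int]] = []
    start = None
    for pos, depth in enumerate(depths):
        if depth <= max_depth:
            if start is None:
                start = pos  # open a new low-depth run
        elif start is not None:
            if pos - start >= min_length:
                regions.append((start, pos))
            start = None  # close the current run
    # Close a run that extends to the end of the sequence.
    if start is not None and len(depths) - start >= min_length:
        regions.append((start, len(depths)))
    return regions


def dark_fraction(total_length: int, regions: List[Tuple[int, int]]) -> float:
    """Fraction of positions covered by the given regions."""
    if total_length == 0:
        return 0.0
    return sum(end - start for start, end in regions) / total_length


# Toy per-base depths for a hypothetical 10-bp gene body.
toy_depths = [30, 2, 0, 4, 28, 31, 5, 40, 1, 1]
toy_regions = low_depth_regions(toy_depths)
toy_fraction = dark_fraction(len(toy_depths), toy_regions)
```

A per-gene fraction computed this way is the kind of quantity behind statements such as "≥ 5% dark", although the real analysis works on aligned reads and also accounts for mapping quality to find camouflaged regions.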
Background: Rare coding variants ABI3_rs616338-T and PLCG2_rs72824905-G were identified as risk or protective factors, respectively, for Alzheimer’s disease (AD). Methods: We tested the association of these variants with five neurodegenerative diseases in Caucasian case-control cohorts: 2742 AD, 231 progressive supranuclear palsy (PSP), 838 Parkinson’s disease (PD), 306 dementia with Lewy bodies (DLB) and 150 multiple system atrophy (MSA) vs. 3351 controls; and in an African-American AD case-control cohort (181 AD, 331 controls). 1479 AD and 1491 controls were non-overlapping with a prior report. Results: Using Fisher’s exact test, there was significant association of both ABI3_rs616338-T (OR = 1.41, p = 0.044) and PLCG2_rs72824905-G (OR = 0.56, p = 0.008) with AD. These OR estimates were maintained in the non-overlapping replication AD-control analysis, albeit at reduced significance (ABI3_rs616338-T OR = 1.44, p = 0.12; PLCG2_rs72824905-G OR = 0.66, p = 0.19). None of the other cohorts showed significant associations that were concordant with those for AD, although the DLB cohort had suggestive findings (Fisher’s test: ABI3_rs616338-T OR = 1.79, p = 0.097; PLCG2_rs72824905-G OR = 0.32, p = 0.124). PLCG2_rs72824905-G showed suggestive association with pathologically-confirmed MSA (OR = 2.39, p = 0.050) and PSP (OR = 1.97, p = 0.061), although in the opposite direction of that for AD. We assessed RNA sequencing data from 238 temporal cortex (TCX) and 224 cerebellum (CER) samples from AD, PSP and control patients and identified co-expression networks, enriched in microglial genes and immune response GO terms, and which harbor PLCG2 and/or ABI3. These networks had higher expression in AD, but not in PSP TCX, compared to controls. This expression association did not survive adjustment for brain cell type population changes. Conclusions: We validated the associations previously reported with ABI3_rs616338-T and PLCG2_rs72824905-G in a Caucasian AD case-control cohort, and observed a similar direction of effect in DLB. Conversely, PLCG2_rs72824905-G showed suggestive associations with PSP and MSA in the opposite direction. We identified microglial gene-enriched co-expression networks with significantly higher levels in AD TCX, but not in PSP, a primary tauopathy. This co-expression network association appears to be driven by microglial cell population changes in a brain region affected by AD pathology. Although these findings require replication in larger cohorts, they suggest distinct effects of the microglial genes, ABI3 and PLCG2 in neurodegenerative diseases that harbor significant vs. low/no amyloid β pathology. Electronic supplementary material: The online version of this article (10.1186/s13024-018-0289-x) contains supplementary material, which is available to authorized users.
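The odds ratios and p-values quoted above come from Fisher's exact test on 2×2 carrier-by-status tables. The sketch below shows how such a test can be computed with the Python standard library only. The function names are hypothetical, the carrier counts in the example are invented for illustration (the abstract does not report them), and the two-sided p-value uses the common convention of summing all tables no more probable than the observed one.

```python
from math import comb


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Sample odds ratio for the table [[a, b], [c, d]].

    Rows: cases, controls. Columns: carriers, non-carriers.
    Assumes b and c are non-zero.
    """
    return (a * d) / (b * c)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x carriers among the cases,
        # with all table margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum every table that is no more probable than the observed one;
    # the small tolerance guards against floating-point ties.
    total = sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, total)


# Invented counts, for illustration only:
# 40 carriers among 2000 cases, 25 carriers among 2500 controls.
example_or = odds_ratio(40, 1960, 25, 2475)
example_p = fisher_exact_two_sided(40, 1960, 25, 2475)
```

With rare variants the carrier cells are small, which is why an exact test is preferred here over a chi-squared approximation.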
The availability of high-quality RNA-sequencing and genotyping data of post-mortem brain collections from consortia such as the CommonMind Consortium (CMC) and the Accelerating Medicines Partnership for Alzheimer’s Disease (AMP-AD) Consortium enables the generation of a large-scale brain cis-eQTL meta-analysis. Here we generate cerebral cortical eQTL from 1433 samples available from four cohorts (identifying >4.1 million significant eQTL for >18,000 genes), as well as cerebellar eQTL from 261 samples (identifying 874,836 significant eQTL for >10,000 genes). We find substantially improved power in the meta-analysis over individual cohort analyses, particularly in comparison to the Genotype-Tissue Expression (GTEx) Project eQTL. Additionally, we observe differences in eQTL patterns between cerebral and cerebellar brain regions. We provide these brain eQTL as a resource for use by the research community. As a proof of principle for their utility, we apply a colocalization analysis to identify genes underlying the GWAS association peaks for schizophrenia and identify a potentially novel gene colocalization with lncRNA RP11-677M14.2 (posterior probability of colocalization 0.975).
Progressive supranuclear palsy (PSP) is a neurodegenerative parkinsonian disorder characterized by tau pathology in neurons and glial cells. Transcriptional regulation has been implicated as a potential mechanism in conferring disease risk and neuropathology for some PSP genetic risk variants. However, the role of transcriptional changes as potential drivers of distinct cell-specific tau lesions has not been explored. In this study, we integrated brain gene expression measurements, quantitative neuropathology traits and genome-wide genotypes from 268 autopsy-confirmed PSP patients to identify transcriptional associations with unique cell-specific tau pathologies. We provide individual transcript and transcriptional network associations for quantitative oligodendroglial (coiled bodies = CB), neuronal (neurofibrillary tangles = NFT), and astrocytic (tufted astrocytes = TA) tau pathology and for tau threads, together with genomic annotations of these findings. We identified divergent patterns of transcriptional associations for the distinct tau lesions, with the neuronal and astrocytic neuropathologies being the most different. We determined that NFT are positively associated with a brain co-expression network enriched for synaptic and PSP candidate risk genes, whereas TA are positively associated with a microglial gene-enriched immune network. In contrast, TA are negatively associated with synaptic transcripts and NFT with immune system transcripts. Our findings have implications for the diverse molecular mechanisms that underlie cell-specific vulnerability and disease risk in PSP. Electronic supplementary material: The online version of this article (10.1007/s00401-018-1900-5) contains supplementary material, which is available to authorized users.
Large-scale brain bulk-RNAseq studies identified molecular pathways implicated in Alzheimer’s disease (AD); however, these findings can be confounded by cellular composition changes in bulk tissue. To identify cell-intrinsic gene expression alterations of individual cell types, we designed a bioinformatics pipeline and analyzed three AD and control bulk-RNAseq datasets of temporal and dorsolateral prefrontal cortex from 685 brain samples. We detected cell-proportion changes in AD brains that are robustly replicable across the three independently assessed cohorts. We applied three different algorithms, including our in-house algorithm, to identify cell-intrinsic differentially expressed genes in individual cell types (CI-DEGs). We assessed the performance of all algorithms by comparison to single nucleus RNAseq data. We identified consensus CI-DEGs that are common to multiple brain regions. Despite significant overlap between consensus CI-DEGs and bulk-DEGs, many CI-DEGs were absent from bulk-DEGs. Consensus CI-DEGs and their enriched GO terms include genes and pathways previously implicated in AD or neurodegeneration, as well as novel ones. We demonstrated that the detection of CI-DEGs through computational deconvolution methods is promising and highlighted remaining challenges. These findings provide novel insights into cell-intrinsic transcriptional changes of individual cell types in AD and may refine discovery and modeling of molecular targets that drive this complex disease.