Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly ‘housekeeping’, whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.
Significance: Despite concerted efforts to identify causal genes that drive breast cancer (BC) initiation and progression, we have yet to establish robust signatures to stratify patient risk. Here we used in vivo transposon-based forward genetic screening to identify potentially relevant BC driver genes. Integrating this approach with survival prediction analysis, we identified six gene pairs that could prognose human BC subtypes into high-, intermediate-, and low-risk groups with high confidence and reproducibility. Furthermore, we identified susceptibility gene sets for basal and claudin-low subtypes (21 and 16 genes, respectively) that stratify patients into three relative risk subgroups. These signatures offer valuable prognostic insight into the genetic basis of BC and allow further exploration of the interconnectedness of BC driver genes during disease progression.
Background: Heart failure (HF) is the most common long-term complication of acute myocardial infarction (MI). Understanding plasma proteins associated with post-MI HF and their gene expression may identify new candidates for biomarker and drug target discovery. Methods: We employed aptamer-based affinity-capture plasma proteomics to measure 1305 plasma proteins at one month post-MI in a New Zealand cohort (CDCS) including 181 post-MI patients who were subsequently hospitalized for HF compared with 250 post-MI patients who remained event-free over a median follow-up of 4.9 years. We then correlated plasma proteins with left ventricular ejection fraction measured at 4 months post-MI and identified proteins potentially co-regulated in post-MI HF using Weighted Gene Co-expression Network Analysis (WGCNA). A Singapore cohort (IMMACULATE) of 223 post-MI patients, of whom 33 were hospitalized for HF (median follow-up 2.0 years), was used for further candidate enrichment of plasma proteins using Fisher meta-analysis, resampling-based statistical testing and machine learning. We then cross-referenced differentially-expressed proteins with their differentially-expressed genes from single-cell transcriptomes of non-myocyte cardiac cells isolated from a murine MI model, and single-cell and single-nuclei transcriptomes of cardiac myocytes from murine HF models and human HF patients. Results: In the CDCS cohort, 212 differentially-expressed plasma proteins were significantly associated with subsequent HF events. Of these, 96 correlated with left ventricular ejection fraction measured at 4 months post-MI. WGCNA prioritised 63 of the 212 proteins that demonstrated significantly higher correlations among patients who developed post-MI HF compared with event-free controls (dataset 1).
Cross-cohort meta-analysis of the IMMACULATE cohort identified 36 plasma proteins associated with post-MI HF (dataset 2) while single-cell transcriptomes identified 15 gene-protein candidates (dataset 3). The majority of prioritized proteins were of matricellular origin. The 6 most highly-enriched proteins that were common to all 3 datasets included well-established biomarkers of post-MI HF - N-terminal B-type natriuretic peptide and troponin T - as well as newly-emergent biomarkers - angiopoietin-2, thrombospondin-2, latent transforming growth factor-β binding protein-4 and follistatin-related protein-3. Conclusions: Large-scale human plasma proteomics, cross-referenced to unbiased cardiac cell transcriptomics at single-cell resolution, prioritized protein candidates associated with post-MI HF for further mechanistic and clinical validation.
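The abstract above names "Fisher meta-analysis" for combining evidence across the two cohorts but gives no details. As a rough illustration only, the textbook form of Fisher's combined probability test can be sketched as follows. It assumes independent p-values for the same protein from each cohort; the function name and the example p-values are hypothetical and are not taken from the study.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p), where statistic = -2 * sum(ln p_i)
    follows a chi-square distribution with 2k degrees of freedom under
    the null hypothesis (k = number of p-values).
    """
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)

    # For an even number of degrees of freedom (2k), the chi-square
    # survival function has a closed form:
    #   P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical p-values for one protein in two cohorts.
stat, p = fisher_combined_p([0.05, 0.05])
print(f"chi-square = {stat:.3f}, combined p = {p:.4f}")
```

With a single p-value the combined result equals the input, which is a quick sanity check on the closed-form survival function.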
Background: The human genome folds in 3 dimensions to form thousands of chromatin loops inside the nucleus, encasing genes and cis-regulatory elements for accurate gene expression control. Physical tethers of loops are anchored by the DNA-binding protein CTCF and the cohesin ring complex. Because heart failure is characterized by hallmark gene expression changes, it was recently reported that substantial CTCF-related chromatin reorganization underpins the myocardial stress–gene response, paralleled by chromatin domain boundary changes observed in CTCF knockout. Methods: We undertook an independent and orthogonal analysis of chromatin organization with a mouse pressure-overload model of myocardial stress (transverse aortic constriction) and cardiomyocyte-specific knockout of Ctcf. We also downloaded published data sets of similar cardiac mouse models and subjected them to independent reanalysis. Results: We found that the cardiomyocyte chromatin architecture remains broadly stable in transverse aortic constriction hearts, whereas Ctcf knockout resulted in ≈99% abolition of global chromatin loops. Disease gene expression changes correlated instead with differential histone H3K27-acetylation enrichment at their respective proximal and distal interacting genomic enhancers confined within these static chromatin structures. Moreover, coregulated genes were mapped out as interconnected gene sets on the basis of their multigene 3D interactions. Conclusions: This work reveals a more stable genome-wide chromatin framework than previously described. Myocardial stress–gene transcription responds instead through H3K27-acetylation enhancer enrichment dynamics and gene networks of coregulation. Robust and intact CTCF looping is required for the induction of a rapid and accurate stress response.
Single-cell transcriptomic profiling is a powerful tool to explore cellular heterogeneity. However, most of these methods focus on the 3′-end of polyadenylated transcripts and provide only a partial view of the transcriptome. We introduce C1 CAGE, a method for the detection of transcript 5′-ends with an original sample multiplexing strategy in the C1™ microfluidic system. We first quantify the performance of C1 CAGE and find it as accurate and sensitive as other methods in the C1 system. We then use it to profile promoter and enhancer activities in the cellular response to TGF-β of lung cancer cells and discover subpopulations of cells differing in their response. We also describe enhancer RNA dynamics revealing transcriptional bursts in subsets of cells with transcripts arising from either strand in a mutually exclusive manner, validated using single molecule fluorescence in situ hybridization.
Single-cell transcriptomic profiling is a powerful tool to explore cellular heterogeneity. However, most of these methods focus on the 3'-end of polyadenylated transcripts and provide only a partial view of the transcriptome. We introduce C1 CAGE, a method for the detection of transcript 5'-ends with an original sample multiplexing strategy in the C1™ microfluidic system. We first quantified the performance of C1 CAGE and found it as accurate and sensitive as other methods in the C1 system. We then used it to profile promoter and enhancer activities in the cellular response to TGF-β of lung cancer cells and discovered subpopulations of cells differing in their response. We also describe enhancer RNA dynamics revealing transcriptional bursts in subsets of cells with transcripts arising from either strand within a single cell in a mutually exclusive manner, which was validated using single molecule fluorescence in situ hybridization. Single-cell transcriptomic profiling can be used to uncover the dynamics of cellular states and gene regulatory networks within a cell population (Trapnell, 2015; Wagner, Regev and Yosef, 2016). Most available single-cell methods capture the 3'-end of transcripts and are unable to identify where transcription initiates. Instead, capturing the 5'-end of transcripts allows the identification of transcription start sites (TSS) and thus the inference of the activities of their regulatory elements. Cap analysis gene expression (CAGE), which captures the 5'-end of transcripts, is a powerful tool to identify TSS at single nucleotide resolution (Shiraki et al., 2003; Carninci et al., 2006). Using this technique, the FANTOM consortium has built an atlas of TSS across major human cell-types and tissues (Forrest et al., 2014), analysis of which has led to the identification of promoters as well as enhancers in the human genome (Andersson et al., 2014; Hon et al., 2017).
Enhancers have been implicated in a variety of biological processes (Lam et al., 2014; Li, Notani and Rosenfeld, 2016), including the initial activation of responses to stimuli (Arner et al., 2015) and chromatin remodeling for transcriptional activation (Mousavi et al., 2013). In addition, over 60% of the fine-mapped causal noncoding variants in autoimmune disease lay within immune-cell enhancers (Farh et al., 2015), suggesting the relevance of enhancers in pathogenesis of complex diseases. Enhancers have been identified by the presence of balanced bidirectional transcription producing enhancer RNAs (eRNAs), which are generally short, unstable and non-polyadenylated (non-polyA) (Andersson et al., 2014). Single molecule fluorescence in situ hybridization (smFISH) studies have suggested that eRNAs are induced with similar kinetics to their target mRNAs but that co-expression at individual alleles was infrequent (Rahman et al., 2016). However, the majority of enhancer studies have been conducted using bulk populations of cells...
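The passage above describes enhancers as sites of balanced bidirectional transcription. One common way to quantify that balance from strand-specific 5'-end counts is a directionality score, D = (F - R) / (F + R), where F and R are the forward- and reverse-strand tag counts at a candidate locus. The sketch below is illustrative only: the function names are hypothetical, and the 0.8 cutoff is an assumed example threshold, not a value stated in the text.

```python
def directionality(forward, reverse):
    """Directionality score D = (F - R) / (F + R), ranging from -1 to 1.

    D near 0 means transcription is balanced across the two strands;
    D near +1 or -1 means it is dominated by one strand.
    """
    total = forward + reverse
    if total == 0:
        raise ValueError("locus has no tags on either strand")
    return (forward - reverse) / total


def is_balanced_bidirectional(forward, reverse, max_abs_d=0.8):
    """Flag a locus as bidirectionally transcribed.

    max_abs_d is an assumed, illustrative cutoff on |D|; both strands
    must also carry at least one tag.
    """
    if forward <= 0 or reverse <= 0:
        return False
    return abs(directionality(forward, reverse)) <= max_abs_d


# Hypothetical tag counts at three candidate loci: (forward, reverse).
for f, r in [(10, 10), (30, 10), (100, 1)]:
    print(f, r, round(directionality(f, r), 3), is_balanced_bidirectional(f, r))
```

In bulk data the two strands are pooled over many cells, so a balanced score does not by itself show that both strands are transcribed in the same cell, which is the question the single-cell analysis addresses.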
Background: A sense-antisense gene pair (SAGP) is a gene pair where two oppositely transcribed genes share a common nucleotide sequence region. In eukaryotic genomes, SAGPs can be organized in complex sense-antisense architectures (CSAGAs) in which at least one sense gene shares loci with two or more antisense partners. As shown in several case studies, SAGPs may be involved in cancers, neurological diseases and complex syndromes. However, CSAGAs have not yet been characterized in the context of human disease or cancer. Results: We characterize five genes (TMEM97, IFT20, TNFAIP1, POLDIP2 and TMEM199) organized in a CSAGA on 17q11.2 (we term this the TNFAIP1/POLDIP2 CSAGA) and demonstrate their strong and reproducible co-regulatory transcription pattern in breast cancer tumours. Genes of the TNFAIP1/POLDIP2 CSAGA are located inside the smallest region of recurrent amplification on 17q11.2 and their expression profile correlates with the DNA copy number of the region. Survival analysis of a group of 410 breast cancer patients revealed significant survival-associated individual genes and gene pairs in the TNFAIP1/POLDIP2 CSAGA. Moreover, several of the gene pairs associated with survival demonstrated synergistic effects. Expression of member genes of the TNFAIP1/POLDIP2 CSAGA also strongly correlated with expression of genes of the ERBB2 core region of recurrent amplification on 17q12. We clearly demonstrate that the observed co-regulatory transcription profile of the TNFAIP1/POLDIP2 CSAGA is maintained not only by a DNA amplification mechanism, but also by chromatin remodelling and local transcription activation. Conclusion: We have identified a novel TNFAIP1/POLDIP2 CSAGA and characterized its co-regulatory transcription profile in cancerous breast tissues.
We suggest that the TNFAIP1/POLDIP2 CSAGA represents a clinically significant transcriptional structural-functional gene module associated with amplification of the genomic region on 17q11.2 and correlated with expression of ERBB2 amplicon core genes in breast cancer. The co-expression pattern of this module correlates with histological grades and a poor prognosis in breast cancer when over-expressed. The TNFAIP1/POLDIP2 CSAGA maps the risks of breast cancer relapse onto the complex genomic locus on 17q11.2.