The choice of a polyadenylation site determines the length of the 3′-untranslated region (3′-UTR) of an mRNA. Inclusion or exclusion of regulatory sequences in the 3′-UTR may ultimately affect gene expression levels. Poly(A) binding protein nuclear 1 (PABPN1) is involved in polyadenylation of pre-mRNAs. An alanine repeat expansion in PABPN1 (exp-PABPN1) causes oculopharyngeal muscular dystrophy (OPMD). We hypothesized that previously observed disturbed gene expression patterns in OPMD muscles may have been the result of an effect of PABPN1 on alternative polyadenylation, influencing mRNA stability, localization and translation. A single molecule polyadenylation site sequencing method was developed to explore polyadenylation site usage on a genome-wide level in mice overexpressing exp-PABPN1. We identified 2012 transcripts with altered polyadenylation site usage. In the vast majority, more proximal alternative polyadenylation sites were used, resulting in shorter 3′-UTRs. 3′-UTR shortening was generally associated with increased expression. Similar changes in polyadenylation site usage were observed after knockdown or overexpression of expanded but not wild-type PABPN1 in cultured myogenic cells. Our data indicate that PABPN1 is important for polyadenylation site selection and that reduced availability of functional PABPN1 in OPMD muscles results in use of alternative polyadenylation sites, leading to large-scale deregulation of gene expression.
Background: The multifaceted control of gene expression requires tight coordination of regulatory mechanisms at transcriptional and post-transcriptional level. Here, we studied the interdependence of transcription initiation, splicing and polyadenylation events on single mRNA molecules by full-length mRNA sequencing. Results: In MCF-7 breast cancer cells, we find 2700 genes with interdependent alternative transcription initiation, splicing and polyadenylation events, both in proximal and distant parts of mRNA molecules, including examples of coupling between transcription start sites and polyadenylation sites. The analysis of three human primary tissues (brain, heart and liver) reveals similar patterns of interdependency between transcription initiation and mRNA processing events. We predict thousands of novel open reading frames from full-length mRNA sequences and obtain evidence for their translation by shotgun proteomics. The mapping database rescues 358 previously unassigned peptides and improves the assignment of others. By recognizing sample-specific amino-acid changes and novel splicing patterns, full-length mRNA sequencing improves proteogenomics analysis of MCF-7 cells. Conclusions: Our findings demonstrate that our understanding of transcriptome complexity is far from complete and provide a basis to reveal largely unresolved mechanisms that coordinate transcription initiation and mRNA processing. Electronic supplementary material: The online version of this article (10.1186/s13059-018-1418-0) contains supplementary material, which is available to authorized users.
For genetically heterogeneous diseases, a better understanding of how the underlying gene defects are functionally interconnected will be important for dissecting disease etiology. The Immunodeficiency, Centromeric instability, Facial anomalies (ICF) syndrome is a chromatin disorder characterized by mutations in DNMT3B, ZBTB24, CDCA7 or HELLS. Here, we generated a Zbtb24 BTB domain deletion mouse and found that loss of functional Zbtb24 leads to early embryonic lethality. Transcriptome analysis identified Cdca7 as the top down-regulated gene in Zbtb24 homozygous mutant mESCs, which can be restored by ectopic ZBTB24 expression. We further demonstrate enrichment of ZBTB24 at the CDCA7 promoter, suggesting that ZBTB24 can function as a transcription factor directly controlling Cdca7 expression. Finally, we show that this regulation is conserved between species and that CDCA7 levels are reduced in patients carrying ZBTB24 nonsense mutations. Together, our findings demonstrate convergence of the two ICF genes ZBTB24 and CDCA7 at the level of transcription.
The formation of skeletal muscles is associated with drastic changes in protein requirements known to be safeguarded by tight control of gene transcription and mRNA processing. The contribution of regulation of mRNA translation during myogenesis has not been studied so far. We monitored translation during myogenic differentiation of C2C12 myoblasts, using a simplified protocol for ribosome footprint profiling. Comparison of ribosome footprints to total RNA showed that gene expression is mostly regulated at the transcriptional level. However, a subset of transcripts, enriched for mRNAs encoding ribosomal proteins, was regulated at the level of translation. Enrichment was also found for specific pathways known to regulate muscle biology. We developed a dedicated pipeline to identify translation initiation sites (TISs) and discovered 5333 unannotated TISs, providing a catalog of upstream and alternative open reading frames used during myogenesis. We identified 298 transcripts with a significant switch in TIS usage during myogenesis, which was not explained by alternative promoter usage, as profiled by DeepCAGE. These transcripts were also enriched for ribosomal protein genes. This study demonstrates that differential mRNA translation controls protein expression of specific subsets of genes during myogenesis. Experimental protocols, analytical workflows, tools and data are available through public repositories (http://lumc.github.io/ribosome-profiling-analysis-framework/).
Since its introduction more than twenty years ago, intraportal allogeneic cadaveric islet transplantation has been shown to be a promising therapy for patients with Type I Diabetes (T1D). Despite its positive outcome, the impact of islet transplantation has been limited due to a number of confounding issues, including the limited availability of cadaveric islets, the typically lifelong dependence on immunosuppressive drugs, and the lack of coverage of transplant costs by health insurance companies in some countries. Despite improvements in the immunosuppressive regimen, the number of required islets remains high, with two or more donors per patient often needed. Insulin independence is typically achieved upon islet transplantation, but on average just 25% of patients do not require exogenous insulin injections five years after transplantation. For these reasons, implementation of islet transplantation has been restricted almost exclusively to patients with brittle T1D who cannot avoid hypoglycemic events despite optimized insulin therapy. To improve C-peptide levels in patients with both T1 and T2 Diabetes, numerous clinical trials have explored the efficacy of mesenchymal stem cells (MSCs), both as supporting cells to protect existing β cells, and as a source for newly generated β cells. Transplantation of MSCs is found to be effective for T2D patients, but its efficacy in T1D is controversial, as the ability of MSCs to differentiate into functional β cells in vitro is poor, and transdifferentiation in vivo does not seem to occur. Instead, to address limitations related to supply, human embryonic stem cell (hESC)-derived β cells are being explored as surrogates for cadaveric islets. Transplantation of allogeneic hESC-derived insulin-producing organoids has recently entered Phase I and Phase II clinical trials. Stem cell replacement therapies overcome the barrier of finite availability, but they still face immune rejection.
Immune protective strategies, including coupling hESC-derived insulin-producing organoids with macroencapsulation devices and microencapsulation technologies, are being tested to balance the necessity of immune protection with the need for vascularization. Here, we compare the diverse human stem cell approaches and outcomes of recently completed and ongoing clinical trials, and discuss innovative strategies developed to overcome the most significant challenges remaining for transplanting stem cell-derived β cells.
Many disease-associated variants affect gene expression levels (expression quantitative trait loci, eQTLs) and expression profiling using next generation sequencing (NGS) technology is a powerful way to detect these eQTLs. We analyzed 94 total blood samples from healthy volunteers with DeepSAGE to gain specific insight into how genetic variants affect the expression of genes and lengths of 3′-untranslated regions (3′-UTRs). We detected previously unknown cis-eQTL effects for GWAS hits in disease- and physiology-associated traits. Apart from cis-eQTLs that are typically easily identifiable using microarrays or RNA-sequencing, DeepSAGE also revealed many cis-eQTLs for antisense and other non-coding transcripts, often in genomic regions containing retrotransposon-derived elements. We also identified and confirmed SNPs that affect the usage of alternative polyadenylation sites, thereby potentially influencing the stability of messenger RNAs (mRNA). We then combined the power of RNA-sequencing with DeepSAGE by performing a meta-analysis of three datasets, leading to the identification of many more cis-eQTLs. Our results indicate that DeepSAGE data is useful for eQTL mapping of known and unknown transcripts, and for identifying SNPs that affect alternative polyadenylation. Because of the inherent differences between DeepSAGE and RNA-sequencing, our complementary, integrative approach leads to greater insight into the molecular consequences of many disease-associated variants.
Technological advances in the sequencing field support in-depth characterization of the transcriptome. Here, we review genome-wide RNA sequencing methods used to investigate specific aspects of gene expression and its regulation, from transcription to RNA processing and translation. We discuss tag-based methods for studying transcription, alternative initiation and polyadenylation events, shotgun methods for detection of alternative splicing, full-length RNA sequencing for the determination of complete transcript structures, and targeted methods for studying the processes of transcription and translation. With the ensemble of technologies available, it is now possible to obtain a comprehensive view of transcriptome complexity and the regulation of transcript diversity.