Alternative splicing is a very frequent phenomenon in the human transcriptome. There are four major types of alternative splicing: exon skipping, alternative 3' splice site, alternative 5' splice site, and intron retention. Here we present a large-scale analysis of intron retention in a set of 21,106 known human genes. We observed that 14.8% of these genes showed evidence of at least one intron retention event. Most of the events are located within the untranslated regions (UTRs) of human transcripts. For those retained introns interrupting the coding region, the GC content, codon usage, and the frequency of stop codons suggest that these sequences are under selection for coding potential. Furthermore, 26% of the introns within the coding region participate in the coding of a protein domain. A comparison with mouse shows that at least 22% of all informative examples of retained introns in human are also present in the mouse transcriptome. We argue that these data suggest that a significant fraction of the observed events is not spurious and might reflect biological significance. The analyses also allowed us to generate a reliable set of intron retention events that can be used for the identification of splicing regulatory elements.
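Two of the sequence features named in the abstract above, GC content and stop codon frequency, can be illustrated with a minimal Python sketch. The function names, the choice of reading frame, and the example sequence are assumptions made for illustration; the abstract does not describe how the authors computed these quantities.

```python
# Illustrative sketch only: helper names and the example sequence are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons (DNA alphabet)


def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def stop_codon_frequency(seq: str, frame: int = 0) -> float:
    """Fraction of complete codons in the given reading frame that are stop codons."""
    seq = seq.upper()
    # Step through the sequence three bases at a time, dropping any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    if not codons:
        return 0.0
    return sum(codon in STOP_CODONS for codon in codons) / len(codons)


if __name__ == "__main__":
    example = "ATGTAAGGC"  # made-up sequence, not taken from the study
    print(gc_content(example))
    print(stop_codon_frequency(example, frame=0))
    print(stop_codon_frequency(example, frame=1))
```

A retained intron that keeps the host transcript's reading frame open would show a low in-frame stop codon frequency. Comparing these values between retained and non-retained introns is one plausible way to look for the kind of signal the abstract describes.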
Cancer gene panels (CGPs) are already used in clinical practice to match a tumor's genetic profile with available targeted therapies. We aimed to determine if CGPs could also be applied to estimate tumor mutational load and predict clinical benefit to PD-1 and CTLA-4 checkpoint blockade therapy. Whole-exome sequencing (WES) mutation data obtained from melanoma and non-small cell lung cancer (NSCLC) patients published by Snyder et al. 2014 and Rizvi et al. 2015, respectively, were used to select nonsynonymous somatic mutations occurring in genes included in the Foundation Medicine Panel (FM-CGP) and in our own Institutional Panel (HSL-CGP). CGP-mutational load was calculated for each patient using both panels and was associated with clinical outcomes as defined and reported in the original articles. Higher CGP-mutational load was observed in NSCLC patients presenting durable clinical benefit (DCB) to PD-1 blockade (FM-CGP P=0.03, HSL-CGP P=0.01). We also observed that 69% of patients with high CGP-mutational load experienced DCB to PD-1 blockade, as compared to 20% of patients with low CGP-mutational load (FM-CGP and HSL-CGP P=0.01). Notably, the predictive accuracy of CGP-mutational load for DCB was not statistically different from that estimated by WES (P=0.73). Moreover, a high CGP-mutational load was significantly associated with progression-free survival (PFS) in patients treated with PD-1 blockade (FM-CGP P=0.005, HR 0.27, 95% CI 0.105 to 0.669; HSL-CGP P=0.008, HR 0.29, 95% CI 0.116 to 0.719). Similar associations between CGP-mutational load and clinical benefit to CTLA-4 blockade were not observed. In summary, our data reveal that CGPs can be used to estimate mutational load and to predict clinical benefit to PD-1 blockade, with accuracy similar to that reported using WES.
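The per-patient calculation described above, counting nonsynonymous somatic mutations that fall in panel genes, might look roughly like the following Python sketch. The record layout, the set of effect labels treated as nonsynonymous, and the example genes and patients are all assumptions for illustration; they are not the contents of the FM-CGP or HSL-CGP panels, and the abstract does not specify the authors' exact filtering rules.

```python
from collections import Counter

# Assumption: effect labels counted as nonsynonymous in this sketch.
NONSYNONYMOUS = {"missense", "nonsense", "frameshift", "splice_site"}


def panel_mutational_load(mutations, panel_genes):
    """Count nonsynonymous mutations per patient that fall within panel genes.

    mutations: iterable of (patient_id, gene, effect) tuples (hypothetical layout).
    panel_genes: set of gene symbols included in the panel.
    Patients with no qualifying mutation are kept with a load of 0.
    """
    load = Counter()
    for patient, gene, effect in mutations:
        # Adding 0 still registers the patient, so zero-load patients are not lost.
        load[patient] += int(gene in panel_genes and effect in NONSYNONYMOUS)
    return dict(load)


if __name__ == "__main__":
    # Made-up example data, not taken from the cited cohorts.
    example_panel = {"TP53", "KRAS"}
    example_mutations = [
        ("p1", "TP53", "missense"),
        ("p1", "KRAS", "synonymous"),   # synonymous: not counted
        ("p1", "GENE_X", "missense"),   # gene outside the panel: not counted
        ("p2", "KRAS", "nonsense"),
        ("p2", "TP53", "frameshift"),
        ("p3", "GENE_Y", "missense"),   # patient ends up with a load of 0
    ]
    print(panel_mutational_load(example_mutations, example_panel))
```

Running the same function with the full exome as the "panel" would give a WES-style load for comparison, which is the kind of contrast the abstract reports.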
Wilms tumour (WT) is an embryonal kidney neoplasia for which very few driver genes have been identified. Here we identify DROSHA mutations in 12% of WT samples (26/222) using whole-exome sequencing and targeted sequencing of 10 microRNA (miRNA)-processing genes. A recurrent mutation (E1147K) affecting a metal-binding residue of the RNase IIIb domain is detected in 81% of the DROSHA-mutated tumours. In addition, we identify non-recurrent mutations in other genes of this pathway (DGCR8, DICER1, XPO5 and TARBP2). By assessing the miRNA expression pattern of the DROSHA-E1147K-mutated tumours and cell lines expressing this mutation, we determine that this variant leads to a predominant downregulation of a subset of miRNAs. We confirm that the downregulation occurs exclusively in mature miRNAs and not in primary miRNA transcripts, suggesting that the DROSHA E1147K mutation affects processing of primary miRNAs. Our data underscore the pivotal role of the miRNA biogenesis pathway in WT tumorigenesis, particularly the major miRNA-processing gene DROSHA.
Background: miRNAs are small, non-coding RNA molecules that mainly act as negative regulators of target gene messages. Due to their regulatory functions, they have lately been implicated in several diseases, including malignancies. Roughly half of known miRNA genes are located within previously annotated protein-coding regions ("intragenic miRNAs"). Although a role of intragenic miRNAs as negative feedback regulators has been speculated, to the best of our knowledge there have been no conclusive large-scale studies investigating the relationship between intragenic miRNAs and host genes and their pathways. Results: miRNA-containing host genes were three times longer, contained more introns and had longer 5' introns compared to a randomly sampled gene cohort. These results are consistent with the observation that more than 60% of intronic miRNAs are found within the first five 5' introns. Host gene 3'-untranslated regions (3'-UTRs) were 40% longer and contained significantly more adenylate/uridylate-rich elements (AREs) compared to a randomly sampled gene cohort. Coincidentally, recent literature suggests that several components of the miRNA biogenesis pathway are required for the rapid decay of mRNAs containing AREs. A high-confidence set of predicted mRNA targets of intragenic miRNAs also shared many of these features with the host genes. Approximately 20% of intragenic miRNAs were predicted to target their host mRNA transcript. Further, KEGG pathway analysis demonstrated that 22 of the 74 pathways in which host genes were associated showed significant overrepresentation of proteins encoded by the mRNA targets of associated intragenic miRNAs. Conclusions: Our findings suggest that both host genes and intragenic miRNA targets may potentially be subject to multiple layers of regulation. Tight regulatory control of these genes is likely critical for cellular homeostasis and absence of disease.
To this end, we examined the potential for negative feedback loops between intragenic miRNAs, host genes, and miRNA target genes. We describe how higher-order miRNA feedback on hosts' interactomes may at least in part explain correlation patterns observed between expression of host genes and intragenic miRNA targets in healthy and tumor tissue.
The era of whole-genome sequencing has revealed that gene copy-number changes caused by duplication and deletion events have important evolutionary, functional, and phenotypic consequences. Recent studies have therefore focused on revealing the extent of variation in copy-number within natural populations of humans and other species. These studies have found a large number of copy-number variants (CNVs) in humans, many of which have been shown to have clinical or evolutionary importance. For the most part, these studies have failed to detect an important class of gene copy-number polymorphism: gene duplications caused by retrotransposition, which result in a new intron-less copy of the parental gene being inserted into a random location in the genome. Here we describe a computational approach leveraging next-generation sequence data to detect gene copy-number variants caused by retrotransposition (retroCNVs), and we report the first genome-wide analysis of these variants in humans. We find that retroCNVs account for a substantial fraction of gene copy-number differences between any two individuals. Moreover, we show that these variants may often result in expressed chimeric transcripts, underscoring their potential for the evolution of novel gene functions. By locating the insertion sites of these duplicates, we are able to show that retroCNVs have had an important role in recent human adaptation, and we also uncover evidence that positive selection may currently be driving multiple retroCNVs toward fixation. Together these findings imply that retroCNVs are an especially important class of polymorphism, and that future studies of copy-number variation should search for these variants in order to illuminate their potential evolutionary and functional relevance.