The pan-cancer analysis of whole genomes

The expansion of whole-genome sequencing studies from individual ICGC and TCGA working groups presented the opportunity to undertake a meta-analysis of genomic features across tumour types. To achieve this, the PCAWG Consortium was established. A Technical Working Group implemented the informatics analyses by aggregating the raw sequencing data from different working groups that studied individual tumour types, aligning the sequences to the human genome and delivering a set of high-quality somatic mutation calls for downstream analysis (Extended Data Fig. 1). Given the recent meta-analysis
High-throughput cDNA sequencing technologies have advanced our understanding of transcriptome complexity and regulation. However, these methods lose information contained in biological RNA because the copied reads are often short and because modifications are not retained. We address these limitations using a native poly(A) RNA sequencing strategy developed by Oxford Nanopore Technologies (ONT). Our study generated 9.9 million aligned sequence reads for the human cell line GM12878, using thirty MinION flow cells at six institutions. These native RNA reads had a median length of 771 bases and a maximum aligned length of over 21,000 bases. Mitochondrial poly(A) reads provided an internal measure of read length quality. We combined these long nanopore reads with higher-accuracy short reads and annotated GM12878 promoter regions to identify 33,984 plausible RNA isoforms. We describe strategies for assessing 3′ poly(A) tail length, base modifications, and transcript haplotypes.
While splicing changes caused by somatic mutations in SF3B1 are known, identifying full-length isoform changes may better elucidate the functional consequences of these mutations. We report nanopore sequencing of full-length cDNA from CLL samples with and without SF3B1 mutation, as well as normal B cell samples, giving a total of 149 million pass reads. We present FLAIR (Full-Length Alternative Isoform analysis of RNA), a computational workflow to identify high-confidence transcripts and to perform differential splicing event analysis and differential isoform analysis. Using nanopore reads, we demonstrate differential 3' splice site changes associated with SF3B1 mutation, agreeing with previous studies. We also observe a strong downregulation of intron retention events associated with SF3B1 mutation. Full-length transcript analysis links multiple alternative splicing events together and allows for better estimates of the abundance of productive versus unproductive isoforms. Our work demonstrates the potential utility of nanopore sequencing for cancer and splicing research.
PCAWG working groups focused on unified analyses of copy-number variation 6, structural variants 7,8, germline variants 5, mutational signatures 9 and identification of driver genes 8, among others 5. Here, we report the joint analysis of available matched transcriptome and genome profiling for 1,188 samples from 27 tumour types by the PCAWG Transcriptome Working Group 5, providing the largest, to our knowledge, resource of RNA phenotypes and their underlying genetic changes in cancer so far (Extended Data Fig. 1, Methods, Supplementary Results, Supplementary Table 23). We demonstrate the importance of transcriptomics data in understanding how different dimensions of specific DNA alterations contribute to carcinogenesis and map out the landscape of cancer-related RNA alterations.
SF3B1 is one of the most frequently mutated genes in chronic lymphocytic leukemia (CLL) and is associated with poor patient prognosis. While alternative splicing patterns caused by mutations in SF3B1 have been identified with short-read RNA sequencing, a critical barrier to understanding the functional consequences of these splicing changes is that we lack the full transcript context in which they occur. Using nanopore sequencing technology, we have resequenced full-length cDNA from CLL samples with and without the hotspot SF3B1 K700E mutation, and a normal B cell. We have developed a workflow called FLAIR (Full-Length Alternative Isoform analysis of RNA), leveraging the full-length transcript sequencing data that nanopore affords. We report results from nanopore sequencing that are concordant with known SF3B1 biology from short-read sequencing, as well as altered intron retention events more confidently observed using long reads. Splicing analysis of nanopore reads between the SF3B1 WT and SF3B1 K700E samples identifies alternative upstream 3' splice sites associated with SF3B1 K700E. We also find downregulation of intron retention events in SF3B1 K700E relative to SF3B1 WT and no difference between CLL SF3B1 MT and B cell, suggesting an aberrant intron retention landscape in CLL samples lacking SF3B1 mutation. With full-length isoforms, we are able to better estimate the abundance of RNA transcripts that are productive and will likely be translated versus those that are unproductive. Validation from short-read data also reveals a strong branch point sequence in these downregulated intron retention events, consistent with previously reported branch points associated with mutated SF3B1. As nanopore sequencing has yet to become a routine tool for characterization of the transcriptome, our work demonstrates the potential utility of nanopore sequencing for cancer and splicing research.
Cancers require telomere maintenance mechanisms for unlimited replicative potential. They achieve this through TERT activation or alternative telomere lengthening associated with ATRX or DAXX loss. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, we dissect whole-genome sequencing data of over 2500 matched tumor-control samples from 36 different tumor types to characterize the genomic footprints of these mechanisms. While the telomere content of tumors with ATRX or DAXX mutations (ATRX/DAXX trunc) is increased, tumors with TERT modifications show a moderate decrease of telomere content. One quarter of all tumor samples contain somatic integrations of telomeric sequences into non-telomeric DNA. This fraction rises to 80% in ATRX/DAXX trunc tumors, which carry an aberrant telomere variant repeat (TVR) distribution as another genomic marker. The latter feature includes enrichment or depletion of the previously undescribed singleton TVRs TTCGGG and TTTGGG, respectively. Our systematic analysis provides new insight into the recurrent genomic alterations associated with telomere maintenance mechanisms in cancer.
Many primary tumours have low levels of molecular oxygen (hypoxia), and hypoxic tumours respond poorly to therapy. Pan-cancer molecular hallmarks of tumour hypoxia remain poorly understood, with limited comprehension of its associations with specific mutational processes, non-coding driver genes and evolutionary features. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2658 cancers across 38 tumour types, we quantify hypoxia in 1188 tumours spanning 27 cancer types. Elevated hypoxia associates with increased mutational load across cancer types, irrespective of underlying mutational class. The proportion of mutations attributed to several mutational signatures of unknown aetiology directly associates with the level of hypoxia, suggesting underlying mutational processes for these signatures. At the gene level, driver mutations in TP53, MYC and PTEN are enriched in hypoxic tumours, and mutations in PTEN interact with hypoxia to direct tumour evolutionary trajectories. Overall, hypoxia plays a critical role in shaping the genomic and evolutionary landscapes of cancer.