Gastric cancer is a leading cause of cancer deaths, but analysis of its molecular and clinical characteristics has been complicated by histological and aetiological heterogeneity. Here we describe a comprehensive molecular evaluation of 295 primary gastric adenocarcinomas as part of The Cancer Genome Atlas (TCGA) project. We propose a molecular classification dividing gastric cancer into four subtypes: tumours positive for Epstein–Barr virus, which display recurrent PIK3CA mutations, extreme DNA hypermethylation, and amplification of JAK2, CD274 (also known as PD-L1) and PDCD1LG2 (also known as PD-L2); microsatellite unstable tumours, which show elevated mutation rates, including mutations of genes encoding targetable oncogenic signalling proteins; genomically stable tumours, which are enriched for the diffuse histological variant and mutations of RHOA or fusions involving RHO-family GTPase-activating proteins; and tumours with chromosomal instability, which show marked aneuploidy and focal amplification of receptor tyrosine kinases. Identification of these subtypes provides a roadmap for patient stratification and trials of targeted therapies.
Thymic epithelial tumors (TETs) are one of the rarest adult malignancies. Among TETs, thymoma is the most predominant, characterized by a unique association with autoimmune diseases, followed by thymic carcinoma, which is less common but more clinically aggressive. Using multi-platform omics analyses on 117 TETs, we identify four subtypes of these tumors, defined by genomic hallmarks and associated with survival and World Health Organization histological subtype. We further demonstrate a marked prevalence of a thymoma-specific mutated oncogene, GTF2I, and explore its biological effects using multi-platform analysis. We also observe enrichment of mutations in HRAS, NRAS, and TP53. Last, we identify a molecular link between thymoma and the autoimmune disease myasthenia gravis, characterized by tumoral overexpression of muscle autoantigens and increased aneuploidy.
Paired-end sequencing is a common approach for identifying structural variation (SV) in genomes. Discrepancies between the observed and expected alignments indicate potential SVs. Most SV detection algorithms use only one of the possible signals and ignore reads with multiple alignments. This results in reduced sensitivity to detect SVs, especially in repetitive regions. We introduce GASVPro, an algorithm combining both paired read and read depth signals into a probabilistic model that can analyze multiple alignments of reads. GASVPro outperforms existing methods with a 50 to 90% improvement in specificity on deletions and a 50% improvement on inversions. GASVPro is available at http://compbio.cs.brown.edu/software.
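As a rough illustration of the paired-read signal mentioned above (a discrepancy between observed and expected alignments), the sketch below flags read pairs whose mapped insert size is far from the library's typical value. It is not GASVPro's probabilistic model, and it ignores read depth and multiple alignments. The function name `discordant_pairs`, the median/MAD rule, and the `n_mads` threshold are assumptions made for this example.

```python
from statistics import median


def discordant_pairs(insert_sizes, n_mads=5.0):
    """Return indices of read pairs whose mapped insert size is unusual.

    Hypothetical helper: a pair is called discordant when its insert size
    differs from the library median by more than n_mads median absolute
    deviations (MAD). This rule is an assumption, not GASVPro's model.
    """
    med = median(insert_sizes)
    mad = median([abs(s - med) for s in insert_sizes])
    scale = mad if mad > 0 else 1.0  # fall back to 1 bp if all sizes agree
    return [i for i, s in enumerate(insert_sizes)
            if abs(s - med) > n_mads * scale]


# Toy library with ~300 bp inserts; pair 6 spans a candidate deletion.
sizes = [300, 310, 290, 305, 295, 300, 2300, 302]
flagged = discordant_pairs(sizes)
print(flagged)
```

A robust centre and spread (median and MAD) are used here so that the discordant pairs themselves do not inflate the threshold.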
Serous epithelial ovarian cancer (EOC) patients often succumb to aggressive metastatic disease, yet little is known about the behavior and genetics of ovarian cancer metastasis. Here, we aim to understand how omental metastases differ from primary tumors and how these differences may influence chemotherapy. We analyzed the miRNA expression profiles of primary EOC tumors and their respective omental metastases from 9 patients using miRNA Taqman qPCR arrays. We find 17 miRNAs with differential expression in omental lesions compared to primary tumors. miR-21, miR-150, and miR-146a have low expression in most primary tumors with significantly increased expression in omental lesions, with concomitant decreased expression of predicted mRNA targets based on mRNA expression. We find that miR-150 and miR-146a mediate spheroid size. Both miR-146a and miR-150 increase the number of residual surviving cells by 2–4 fold when challenged with lethal cisplatin concentrations. These observations suggest that at least two of the miRNAs, miR-146a and miR-150, up-regulated in omental lesions, stimulate survival and increase drug tolerance. Our observations suggest that cancer cells in omental tumors express key miRNAs differently than primary tumors, and that at least some of these microRNAs may be critical regulators of the emergence of drug resistant disease.
Large-scale cancer sequencing efforts such as The Cancer Genome Atlas (TCGA) and others have shown that tumors exhibit extensive mutational heterogeneity with relatively few genes mutated at significant frequency and many genes mutated in only a small number of individuals. This long tail phenomenon complicates the identification of driver mutations by their observed frequency. The long tail is explained in part by the fact that driver mutations target genes in signaling and regulatory pathways, and these pathways may be perturbed by different mutations in different tumors. We developed two complementary algorithms, HotNet2 and Dendrix++, to analyze combinations of mutations in known or novel pathways. HotNet2 uses prior knowledge of pathways and protein complexes represented in a genome-scale protein-protein interaction network, and identifies significantly mutated subnetworks using a heat diffusion model. HotNet2 simultaneously assesses both the significance of mutations in individual proteins and the local topology of protein interactions. Dendrix++ identifies combinations of mutations de novo, without prior knowledge of pathways or protein interactions, by finding sets of mutations that are mutually exclusive across the tumor cohort. There are numerous examples of mutually exclusive mutations between interacting proteins; e.g. BRAF and KRAS in colorectal cancer. Dendrix++ generalizes this idea to find larger groups of mutually exclusive mutations. We applied HotNet2 and Dendrix++ to whole-exome sequencing and copy number aberration data from 3299 samples of twelve tumor types from TCGA Pan-Cancer project. Both algorithms identified gene sets that overlap well-known cancer pathways (e.g. TP53, MAPK, and RAS signaling pathways), as well as genes and complexes with less characterized roles in cancer (e.g. the cohesin and condensin complexes). 
HotNet2 subnetworks also contained novel candidate cancer genes that were rarely mutated in this cohort and thus not reported by single-gene tests of significance, including KDM5A, SHPRH, and ARID4A, each of which interacts with well-known cancer genes. Many of these gene sets have biological functions often perturbed in cancer, such as chromatin modification (e.g. the SWI/SNF and BAP1 complexes) and DNA damage repair (e.g. SHPRH). These results demonstrate the ability of HotNet2 and Dendrix++ to identify novel combinations of mutations in thousands of tumors, prioritizing genes and mutations in the long tail for further experimental studies. Citation Format: Mark D. Leiserson, Fabio Vandin, Hsin-Ta Wu, Jason R. Dobson, Benjamin R. Raphael. Pan-cancer identification of mutated pathways and protein complexes. [abstract]. In: Proceedings of the 105th Annual Meeting of the American Association for Cancer Research; 2014 Apr 5-9; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2014;74(19 Suppl):Abstract nr 5324. doi:10.1158/1538-7445.AM2014-5324
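The idea of mutual exclusivity described in this abstract can be made concrete with a small sketch. It scores a candidate gene set by how many samples it covers minus how many mutations fall in samples that are already covered. The abstract does not state the scoring function used by Dendrix++, so this coverage-minus-overlap score, the function name `coverage_and_exclusivity`, and the toy mutation data are assumptions for illustration only.

```python
def coverage_and_exclusivity(mutations, gene_set):
    """Score a gene set for coverage and mutual exclusivity.

    mutations: dict mapping gene name -> set of sample ids mutated in it.
    Returns (coverage, overlap, weight), where
      coverage = number of samples with a mutation in at least one gene,
      overlap  = mutations landing in samples that are already covered,
      weight   = coverage - overlap (an assumed, illustrative score).
    """
    covered = set()
    total = 0
    for gene in gene_set:
        samples = mutations.get(gene, set())
        covered |= samples
        total += len(samples)
    coverage = len(covered)
    overlap = total - coverage
    return coverage, overlap, coverage - overlap


# Hypothetical toy cohort: BRAF and KRAS never co-occur; TP53 co-occurs.
toy = {
    "BRAF": {"s1", "s2"},
    "KRAS": {"s3", "s4", "s5"},
    "TP53": {"s1", "s3", "s6"},
}
print(coverage_and_exclusivity(toy, ["BRAF", "KRAS"]))
print(coverage_and_exclusivity(toy, ["BRAF", "TP53"]))
```

A perfectly exclusive set has zero overlap, so its weight equals its coverage; co-occurring mutations lower the weight.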
Structural variation, including large deletions, duplications, inversions, translocations, and other rearrangements, is common in human and cancer genomes. A number of methods have been developed to identify structural variants from Illumina short-read sequencing data. However, reliable identification of structural variants remains challenging because many variants have breakpoints in repetitive regions of the genome and thus are difficult to identify with short reads. The recently developed linked-read sequencing technology from 10X Genomics combines a novel barcoding strategy with Illumina sequencing. This technology labels all reads that originate from a small number (~5-10) of DNA molecules ~50 Kbp in length with the same molecular barcode. These barcoded reads contain long-range sequence information that is advantageous for identification of structural variants. We present Novel Adjacency Identification with Barcoded Reads (NAIBR), an algorithm to identify structural variants in linked-read sequencing data. NAIBR predicts novel adjacencies in an individual genome resulting from structural variants using a probabilistic model that combines multiple signals in barcoded reads. We show that NAIBR outperforms several existing methods for structural variant identification, including two recent methods that also analyze linked-reads, on simulated sequencing data and 10X whole-genome sequencing data from the NA12878 human genome and the HCC1954 breast cancer cell line. Several of the novel somatic structural variants identified in HCC1954 overlap known cancer genes.
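To illustrate why shared barcodes carry long-range information, the sketch below lists the barcodes observed in both of two distant genomic windows. Reads from windows much farther apart than a single molecule should rarely share a barcode unless the windows are adjacent in the sample genome. This is only one simplified signal and not NAIBR's probabilistic model. The read representation `(chrom, pos, barcode)`, the window tuples, and the function name `shared_barcodes` are assumptions for this example.

```python
def shared_barcodes(reads, window_a, window_b):
    """Return the sorted barcodes seen in both genomic windows.

    reads: iterable of (chrom, pos, barcode) tuples (assumed representation).
    window_a, window_b: (chrom, start, end) with a half-open range [start, end).
    """
    def in_window(chrom, pos, window):
        w_chrom, start, end = window
        return chrom == w_chrom and start <= pos < end

    in_a = {bc for chrom, pos, bc in reads if in_window(chrom, pos, window_a)}
    in_b = {bc for chrom, pos, bc in reads if in_window(chrom, pos, window_b)}
    return sorted(in_a & in_b)


# Hypothetical reads: barcodes A and B appear ~900 Kbp apart on chr1,
# farther than a ~50 Kbp molecule would span in the reference genome.
toy_reads = [
    ("chr1", 1000, "A"), ("chr1", 1200, "B"),
    ("chr1", 900000, "A"), ("chr1", 900500, "C"),
    ("chr1", 901000, "B"), ("chr2", 1000, "D"),
]
candidates = shared_barcodes(toy_reads, ("chr1", 0, 5000), ("chr1", 895000, 905000))
print(candidates)
```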
Motivation: Somatic copy number aberrations (SCNAs) are frequent in cancer genomes, but many of these are random, passenger events. A common strategy to distinguish functional aberrations from passengers is to identify those aberrations that are recurrent across multiple samples. However, the extensive variability in the length and position of SCNAs makes the problem of identifying recurrent aberrations notoriously difficult. Results: We introduce a combinatorial approach to the problem of identifying independent and recurrent SCNAs, focusing on the key challenge of separating the overlaps in aberrations across individuals into independent events. We derive independent and recurrent SCNAs as maximal cliques in an interval graph constructed from overlaps between aberrations. We efficiently enumerate all such cliques, and derive a dynamic programming algorithm to find an optimal selection of non-overlapping cliques, resulting in a very fast algorithm, which we call RAIG (Recurrent Aberrations from Interval Graphs). We show that RAIG outperforms other methods on simulated data and also performs well on data from three cancer types from The Cancer Genome Atlas (TCGA). In contrast to existing approaches that employ various heuristics to select independent aberrations, RAIG optimizes a well-defined objective function. We show that this allows RAIG to identify rare aberrations that are likely functional, but are obscured by overlaps with larger passenger aberrations. Availability: http://compbio.cs.brown.edu/software. Contact: braphael@brown.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
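The two steps named in this abstract, enumerating maximal cliques of an interval graph and choosing non-overlapping cliques by dynamic programming, can be sketched as follows. The abstract does not give RAIG's objective function, so this example makes two assumptions: a clique's weight is simply its number of aberrations, and two cliques "overlap" when they share an aberration. The function names and toy intervals are hypothetical.

```python
def maximal_cliques(intervals):
    """Enumerate maximal cliques of the interval graph, left to right.

    intervals: list of closed (start, end) pairs, one per aberration.
    A sweep over sorted endpoints emits the active set each time an
    interval closes right after at least one interval has opened.
    """
    events = []
    for idx, (start, end) in enumerate(intervals):
        events.append((start, 0, idx))  # 0 = open; sorts before a close
        events.append((end, 1, idx))    # 1 = close at the same coordinate
    events.sort()
    active, cliques, grew = set(), [], False
    for _, kind, idx in events:
        if kind == 0:
            active.add(idx)
            grew = True
        else:
            if grew:
                cliques.append(frozenset(active))
                grew = False
            active.remove(idx)
    return cliques


def select_disjoint_cliques(cliques):
    """Pick cliques sharing no aberration, maximizing total clique size.

    Assumed objective (not RAIG's). Relies on the left-to-right order from
    maximal_cliques, in which each interval lies in consecutive cliques.
    """
    n = len(cliques)
    best = [0] * (n + 1)      # best[k]: optimum over the first k cliques
    take = [False] * (n + 1)
    prev = [0] * (n + 1)
    for j in range(1, n + 1):
        cur = cliques[j - 1]
        p = j - 1
        while p > 0 and cliques[p - 1] & cur:
            p -= 1            # skip earlier cliques that share an interval
        prev[j] = p
        with_j = len(cur) + best[p]
        if with_j > best[j - 1]:
            best[j], take[j] = with_j, True
        else:
            best[j] = best[j - 1]
    chosen, j = [], n
    while j > 0:              # backtrack through the DP table
        if take[j]:
            chosen.append(cliques[j - 1])
            j = prev[j]
        else:
            j -= 1
    return chosen[::-1]


# Hypothetical aberrations on one chromosome arm (arbitrary coordinates).
toy_intervals = [(1, 5), (2, 6), (4, 8), (10, 14), (11, 15), (7, 12)]
all_cliques = maximal_cliques(toy_intervals)
picked = select_disjoint_cliques(all_cliques)
print(all_cliques)
print(picked)
```

In the toy data, the long interval `(7, 12)` bridges two recurrent regions; the selection keeps the two three-member cliques and drops the small clique that shares an aberration with each of them.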