Identifying molecular cancer drivers is critical for precision oncology. Multiple advanced algorithms to identify drivers now exist, but systematic attempts to combine and optimize them on large datasets are few. We report a PanCancer and PanSoftware analysis spanning 9,423 tumor exomes (comprising all 33 of The Cancer Genome Atlas projects) and using 26 computational tools to catalog driver genes and mutations. We identify 299 driver genes with implications regarding their anatomical sites and cancer/cell types. Sequence- and structure-based analyses identified >3,400 putative missense driver mutations supported by multiple lines of evidence. Experimental validation confirmed 60%-85% of predicted mutations as likely drivers. We found that >300 MSI tumors are associated with high PD-1/PD-L1, and 57% of tumors analyzed harbor putative clinically actionable events. Our study represents the most comprehensive discovery of cancer genes and mutations to date and will serve as a blueprint for future biological and clinical endeavors.
Colorectal cancer (CRC) is the third most common cancer worldwide with 1.2 million patients diagnosed yearly. In late-stage CRC, the most commonly used targeted therapies are monoclonal antibodies cetuximab and panitumumab, which inactivate EGFR [1]. Recent studies have identified alterations in KRAS [2–4] and other genes [5–13] as likely mechanisms of primary and secondary resistance to anti-EGFR antibody therapy. Despite these efforts, additional mechanisms of resistance to EGFR blockade are thought to be present in CRC and little is known about determinants of sensitivity to this therapy. To examine the effect of somatic genetic changes in CRC on response to anti-EGFR antibody therapy, we performed complete exome sequence and copy number analyses of 129 patient-derived tumorgrafts and targeted genomic analyses of 55 patient tumors, all of which were KRAS wild-type. We analyzed the response of tumors to anti-EGFR antibody blockade in tumorgraft models or in clinical settings. In addition to previously identified genes, we detected mutations in ERBB2, EGFR, FGFR1, PDGFRA, and MAP2K1 as potential mechanisms of primary resistance to this therapy. Novel alterations in the ectodomain of EGFR were identified in patients with acquired resistance to EGFR blockade. Amplifications and sequence changes in the tyrosine kinase receptor adaptor gene IRS2 were identified in tumors with increased sensitivity to anti-EGFR therapy. Therapeutic resistance to EGFR blockade could be overcome in tumorgraft models through combinatorial therapies targeting actionable genes. These analyses provide a systematic approach to evaluate response to targeted therapies in human cancer, highlight new mechanisms of responsiveness to anti-EGFR therapies, and provide new avenues for intervention in the management of CRC.
Sequencing has identified millions of somatic mutations in human cancers, but distinguishing cancer driver genes remains a major challenge. Numerous methods have been developed to identify driver genes, but evaluation of the performance of these methods is hindered by the lack of a gold standard, that is, bona fide driver gene mutations. Here, we establish an evaluation framework that can be applied to driver gene prediction methods. We used this framework to compare the performance of eight such methods. One of these methods, described here, incorporated a machine learning-based ratiometric approach. We show that the driver genes predicted by each of the eight methods vary widely. Moreover, the P values reported by several of the methods were inconsistent with the uniform values expected, thus calling into question the assumptions that were used to generate them. Finally, we evaluated the potential effects of unexplained variability in mutation rates on false-positive driver gene predictions. Our analysis points to the strengths and weaknesses of each of the currently available methods and offers guidance for improving them in the future.
cancer genomics | DNA sequencing | driver genes | cancer mutations | computational method evaluation
The Cancer Genome Atlas (TCGA) has catalyzed systematic characterization of diverse genomic alterations underlying human cancers. At this historic junction marking the completion of genomic characterization of over 11,000 tumors from 33 cancer types, we present our current understanding of the molecular processes governing oncogenesis. We illustrate our insights into cancer through synthesis of the findings of the TCGA PanCancer Atlas project on three facets of oncogenesis: (1) somatic driver mutations, germline pathogenic variants, and their interactions in the tumor; (2) the influence of the tumor genome and epigenome on transcriptome and proteome; and (3) the relationship between tumor and the microenvironment, including implications for drugs targeting driver events and immunotherapies. These results will anchor future characterization of rare and common tumor types, primary and relapsed tumors, and cancers across ancestry groups and will guide the deployment of clinical genomic sequencing.
Metastases are responsible for the majority of cancer-related deaths. Although genomic heterogeneity within primary tumors is associated with relapse, heterogeneity among treatment-naïve metastases has not been comprehensively assessed. We analyzed sequencing data for 76 untreated metastases from 20 patients and inferred cancer phylogenies for breast, colorectal, endometrial, gastric, lung, melanoma, pancreatic, and prostate cancers. We found that within individual patients, a large majority of driver gene mutations are common to all metastases. Further analysis revealed that the driver gene mutations that were not shared by all metastases are unlikely to have functional consequences. A mathematical model of tumor evolution and metastasis formation provides an explanation for the observed driver gene homogeneity. Thus, single biopsies capture most of the functionally important mutations in metastases and therefore provide essential information for therapeutic decision-making.
The functional impact of the vast majority of cancer somatic mutations remains unknown, representing a critical knowledge gap for implementing precision oncology. Here, we report the development of a moderate-throughput functional genomic platform consisting of efficient mutant generation, sensitive viability assays using two growth factor-dependent cell models, and functional proteomic profiling of signaling effects for select aberrations. We apply the platform to annotate >1,000 genomic aberrations, including gene amplifications, point mutations, indels, and gene fusions, potentially doubling the number of driver mutations characterized in clinically actionable genes. Further, the platform is sufficiently sensitive to identify weak drivers. Our data are accessible through a user-friendly, public data portal. Our study will facilitate biomarker discovery, prediction algorithm improvement, and drug development.
Sequencing has identified millions of somatic mutations in human cancers, but distinguishing cancer driver genes remains a major challenge. Numerous methods have been developed to identify driver genes, but evaluation of the performance of these methods is hindered by the lack of a gold standard, i.e., bona fide driver gene mutations. Here, we establish an evaluation framework that can be applied when a gold standard is not available. We used this framework to compare the performance of eight driver gene prediction methods. One of these methods, newly described here, incorporated a machine learning-based ratiometric approach. We show that the driver genes predicted by each of these eight methods vary widely. Moreover, the p-values reported by several of the methods were inconsistent with the uniform values expected, thus calling into question the assumptions that were used to generate them. Finally, we evaluated the potential effects of unexplained variability in mutation rates on false positive driver gene predictions. Our analysis points to the strengths and weaknesses of each of the currently available methods and offers guidance for improving them in the future. (Tokheim et al.)
Significance: Modern large-scale sequencing of human cancers seeks to comprehensively discover mutated genes that confer a selective advantage to cancer cells. Key to this effort has been development of computational algorithms to find genes that drive cancer, based on their patterns of mutation in large patient cohorts. However, since there is no generally accepted gold standard of driver genes, it has been difficult to quantitatively compare these methods. We present a new machine learning method for driver gene prediction and a rigorous protocol to evaluate and compare prediction methods. Our results suggest that most current methods do not adequately account for heterogeneity in the number of mutations expected by chance and consequently have many false positive calls.
The problem is most acute for cancers with high mutation rates and comprehensive discovery of drivers in these cancers may be more difficult than currently anticipated.
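The remark above that reported p-values were "inconsistent with the uniform values expected" can be illustrated with a small sketch. Under a correctly specified null model, the p-values of non-driver genes should be roughly uniform on [0, 1]. The Python below measures the departure from uniformity with a Kolmogorov-Smirnov distance. This is an assumption chosen for illustration, not the metric used by Tokheim et al., which the abstract does not specify. The function name `ks_uniform_statistic` and the simulated p-value lists are likewise hypothetical.

```python
import random


def ks_uniform_statistic(p_values):
    """Kolmogorov-Smirnov distance between the empirical CDF of the
    p-values and the Uniform(0, 1) CDF (0 = perfectly uniform)."""
    ps = sorted(p_values)
    n = len(ps)
    if n == 0:
        raise ValueError("need at least one p-value")
    d = 0.0
    for i, p in enumerate(ps):
        # The empirical CDF jumps from i/n to (i+1)/n at p, while the
        # uniform CDF at p equals p; check the gap on both sides of the jump.
        d = max(d, (i + 1) / n - p, p - i / n)
    return d


# Simulated data (not from any real driver-gene method).
rng = random.Random(0)
well_calibrated = [rng.random() for _ in range(2000)]
# Cubing pushes p-values toward zero, mimicking a method whose null
# model underestimates background variability and over-calls drivers.
inflated = [p ** 3 for p in well_calibrated]

print("calibrated:", round(ks_uniform_statistic(well_calibrated), 3))
print("inflated:  ", round(ks_uniform_statistic(inflated), 3))
```

A distance near zero is what a well-calibrated method would produce on genes with no true signal. A large distance driven by an excess of small p-values is the pattern that would translate into false positive driver calls.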