Abstract: Despite considerable pan-cancer efforts, the link between genomics and transcriptomics in cancer remains relatively weak and mostly based on statistical rather than mechanistic principles. By performing integrative analysis of transcriptomic and mutational profiles on a sample-by-sample basis, via regulatory/signaling networks, we identified a repertoire of 407 Master-Regulator proteins responsible for canalizing the genetics of 20 TCGA cohorts into 112 transcriptionally distinct tumor subtypes. Further analysi…
“…found that expression and proteomics were more predictive than DNA even within tumor types [34]. We believe the discrepancy between these two studies can be explained by the strong curation applied by Iorio Some studies have suggested that the expression of gene sets, such as pathway activations [11] or inferred transcription factor activity [35,36], are more robust and interpretable predictors than expression of individual genes. We found that single genes' expression data produced notably better results than gene set enrichment scores overall, despite having many more presumably irrelevant features.…”
Achieving precision oncology requires accurate identification of targetable cancer vulnerabilities in patients. Generally, genomic features are regarded as the state-of-the-art method for stratifying patients for targeted therapies. In this work, we conduct the first rigorous comparison of DNA- and expression-based predictive models for viability across five datasets encompassing chemical and genetic perturbations. We find that expression consistently outperforms DNA for predicting vulnerabilities, including many currently stratified by canonical DNA markers. Contrary to their perception in the literature, the most successful expression-based models depend on few features and are amenable to biological interpretation. This work points to the importance of exploring more comprehensive expression profiling in clinical settings.
“…Specifically, the VIPER distance between two samples is computed using the reciprocal (i.e., integration of both direct and reverse) enrichment analysis (Kruithof-de Julio et al, 2011) of the Tumor Checkpoint proteins (i.e., 25 most activated and 25 most inactivated) in one sample in proteins differentially activated in the second sample, as implemented by the viperSimilarity function in the VIPER package (Alvarez et al, 2016). Use of 50 proteins (defined as Tumor Checkpoint proteins) for sample similarity analysis is based on recent results showing that, on average, across all TCGA cohorts, the top 50 most aberrantly differentially activated proteins (candidate Master Regulators) are sufficient to canalize the effect of >90% of somatic mutations, on a sample-by-sample basis (Paull et al, 2020). Optimal cluster number was then estimated based on the global similarity of all samples in a cluster (cluster membership strength), as computed based on the conservation of differential protein activities across all samples in the cluster, and evaluated by an Area Under the Curve (AUC) metric (Till, 2001).…”
Section: Laser Capture Microdissection Data Set (Cumc-e)
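The reciprocal enrichment idea in the citation statement above can be illustrated with a minimal Python sketch. This is not the viperSimilarity implementation from the VIPER package: the scoring rule (a difference of mean activity scores) is a simplifying assumption, and the function names `top_sets`, `directed_enrichment`, and `reciprocal_similarity` are hypothetical. It only shows how the 25 most activated and 25 most inactivated proteins of one sample can be scored in the other sample, and how the two directions are then averaged.

```python
import numpy as np

def top_sets(activity, n):
    """Return indices of the n most activated and n most inactivated proteins."""
    order = np.argsort(activity)
    return order[-n:], order[:n]

def directed_enrichment(query, reference, n):
    """Score the query sample's top sets in the reference activity profile.

    Assumption: enrichment is approximated by the mean reference activity of
    the query's activated set minus that of its inactivated set, halved.
    """
    up, down = top_sets(query, n)
    return (reference[up].mean() - reference[down].mean()) / 2.0

def reciprocal_similarity(a, b, n=25):
    """Average the direct (a in b) and reverse (b in a) enrichment scores."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return 0.5 * (directed_enrichment(a, b, n) + directed_enrichment(b, a, n))

# Toy protein-activity profiles (random numbers, not real data).
rng = np.random.default_rng(0)
sample_a = rng.normal(size=200)
sample_b = sample_a + rng.normal(scale=0.5, size=200)
print(reciprocal_similarity(sample_a, sample_b))
```

Averaging the two directions makes the score symmetric in the two samples, which is what a pairwise similarity used for clustering needs. How such a similarity is converted into a distance is not stated in the quoted text, so the sketch stops at the similarity.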
Despite extensive efforts to characterize the transcriptional landscape of pancreatic ductal adenocarcinoma (PDA), reproducible assessment of subtypes with actionable dependencies remains challenging. Systematic, network-based analysis of regulatory protein activity stratified PDA tumours into novel functional subtypes that were highly conserved across multiple cohorts, including at the single cell level and in laser capture microdissected (LCM) samples. Identified subtypes were characterized by activation of master regulator proteins representing either gastrointestinal lineage markers or transcriptional effectors of morphogen pathways. Single cell analysis confirmed the existence of Lineage and Morphogenic states but also revealed a dominant population of more differentiated Oncogenic Precursor (OP) cells, present in all sampled patients, yet not apparent from bulk tumor analysis. Master regulators were validated by pooled, CRISPR/Cas9 screens, demonstrating both subtype-specific and universal dependencies. Conversely, ectopic expression of Lineage MRs, such as OVOL2, was sufficient to reprogram Morphogenic cells, thus providing a roadmap for the future targeting of patient-specific dependencies in PDA.
“…"We call them master regulators." In an analysis of around 10,000 TCGA samples published on the preprint server bioRxiv, Califano and his colleagues identified 407 master regulators that convey the effects of nearly all mutations implicated in the cancer samples 1. Because master regulators are rarely mutated, genomics is not a sure-fire way to identify them.…”