Matching cell lines with cancer type and subtype of origin via mutational, epigenomic, and transcriptomic patterns

Salvadores, Marina; Fuster‐Tormo, Francisco; Supek, Fran

doi:10.1126/sciadv.aba1862

Cited by 62 publications

(74 citation statements)

References 55 publications

“…Consequently, cell lines originating from such tumors might have a different tissue type identification from that specified at isolation, creating a cause of mislabeling. 36 Figure S23 shows a heatmap of the distinctive molecular features between different cancer cell lines.…”

Section: Results (mentioning)

confidence: 99%

pdCSM-cancer: Using Graph-Based Signatures to Identify Small Molecules with Anticancer Properties

Al‐Jarf, Sá, Pires et al. 2021

J. Chem. Inf. Model.

The development of new, effective, and safe drugs to treat cancer remains a challenging and time-consuming task due to limited hit rates, restraining subsequent development efforts. Despite the impressive progress of quantitative structure–activity relationship and machine learning-based models that have been developed to predict molecule pharmacodynamics and bioactivity, they have had mixed success at identifying compounds with anticancer properties against multiple cell lines. Here, we have developed a novel predictive tool, pdCSM-cancer, which uses a graph-based signature representation of the chemical structure of a small molecule in order to accurately predict molecules likely to be active against one or multiple cancer cell lines. pdCSM-cancer represents the most comprehensive anticancer bioactivity prediction platform developed till date, comprising trained and validated models on experimental data of the growth inhibition concentration (GI50%) effects, including over 18,000 compounds, on 9 tumor types and 74 distinct cancer cell lines. Across 10-fold cross-validation, it achieved Pearson’s correlation coefficients of up to 0.74 and comparable performance of up to 0.67 across independent, non-redundant blind tests. Leveraging the insights from these cell line-specific models, we developed a generic predictive model to identify molecules active in at least 60 cell lines. Our final model achieved an area under the receiver operating characteristic curve (AUC) of up to 0.94 on 10-fold cross-validation and up to 0.94 on independent non-redundant blind tests, outperforming alternative approaches. We believe that our predictive tool will provide a valuable resource to optimizing and enriching screening libraries for the identification of effective and safe anticancer molecules. 
To provide a simple and integrated platform to rapidly screen for potential biologically active molecules with favorable anticancer properties, we made pdCSM-cancer freely available online at .

“…When systematic in vivo screens become available, the proposed model can be a stepping stone toward an accurate prediction for tumor dependencies. Furthermore, methods that align genomic profiles between CCLs and tumors (40, 41) may help to reduce the differences between tumor and CCL domains. We expect that the incorporation of these methods will improve the translational capability of DeepDEP along with the expansion of CCLs being screened by the DepMap projects.…”

Section: Discussion (mentioning)

confidence: 99%

Predicting and characterizing a cancer dependency map of tumors with deep learning

Chiu, Zheng, Wang et al. 2021

Sci. Adv.

Genome-wide loss-of-function screens have revealed genes essential for cancer cell proliferation, called cancer dependencies. It remains challenging to link cancer dependencies to the molecular compositions of cancer cells or to unscreened cell lines and further to tumors. Here, we present DeepDEP, a deep learning model that predicts cancer dependencies using integrative genomic profiles. It uses a unique unsupervised pretraining that captures unlabeled tumor genomic representations to improve the learning of cancer dependencies. We demonstrated DeepDEP's improvement over conventional machine learning methods and validated the performance with three independent datasets. By systematic model interpretations, we extended the current dependency maps with functional characterizations of dependencies and a proof-of-concept in silico assay of synthetic essentiality. We applied DeepDEP to pan-cancer tumor genomics and built the first pan-cancer synthetic dependency map of 8000 tumors with clinical relevance. In summary, DeepDEP is a novel tool for investigating cancer dependency with rapidly growing genomic resources.

“…To aid future investigations and navigate the heterogeneity of TERT transcriptomes, we attempted to find suitable cancer cell lines for primary tumor types. While our clustering was based solely on TERT expression, other groups have done similar associations using the whole “omic” data [67, 99]. Particularly, Yu et al utilized the whole transcriptome to identify a comprehensive panel (TCGA-110-CL) of cell lines for 22 tumor types [67].…”

Section: Discussion (mentioning)

confidence: 99%

Analysis of TERT Isoforms across TCGA, GTEx and CCLE Datasets

Subasri, Shooshtari, Watson et al. 2021

Cancers

Reactivation of the multi-subunit ribonucleoprotein telomerase is the primary telomere maintenance mechanism in cancer, but it is rate-limited by the enzymatic component, telomerase reverse transcriptase (TERT). While regulatory in nature, TERT alternative splice variant/isoform regulation and functions are not fully elucidated and are further complicated by their highly diverse expression and nature. Our primary objective was to characterize TERT isoform expression across 7887 neoplastic and 2099 normal tissue samples using The Cancer Genome Atlas (TCGA) and the Genotype-Tissue Expression Project (GTEx), respectively. We confirmed the global overexpression and splicing shift towards full-length TERT in neoplastic tissue. Stratifying by tissue type we found uncharacteristic TERT expression in normal brain tissue subtypes. Stratifying by tumor-specific subtypes, we detailed TERT expression differences potentially regulated by subtype-specific molecular characteristics. Focusing on β-deletion splicing regulation, we found the NOVA1 trans-acting factor to mediate alternative splicing in a cancer-dependent manner. Of relevance to future tissue-specific studies, we clustered cancer cell lines with tumors from related origin based on TERT isoform expression patterns. Taken together, our work has reinforced the need for tissue and tumour-specific TERT investigations, provided avenues to do so, and brought to light the current technical limitations of bioinformatic analyses of TERT isoform expression.
