2015
DOI: 10.1093/bioinformatics/btv529

Improved large-scale prediction of growth inhibition patterns using the NCI60 cancer cell line panel

Abstract: Motivation: Recent large-scale omics initiatives have catalogued the somatic alterations of cancer cell line panels along with their pharmacological response to hundreds of compounds. In this study, we have explored these data to advance computational approaches that enable more effective and targeted use of current and future anticancer therapeutics. Results: We modelled the 50% growth inhibition bioassay end-point (GI50) of 17 142 compounds screened against 59 cancer cell lines from the NCI60 panel (941 831 d…

citations
Cited by 112 publications
(140 citation statements)
references
References 53 publications
(76 reference statements)
“…[30] and Kalliokoski et al. [31] set the basis to estimate the maximum achievable performance of in silico models trained on data issued from different laboratories, which is quantified through the maximum achievable R² values on a test or external set. [39] While cellular sensitivity data are of relevance in predictive modeling as a whole, [33-35,40-42] and in the fields of cell-line sensitivity and toxicology modeling in particular, [40,41,43-47] to date, no systematic study has evaluated the comparability of public in vitro cytotoxicity data on a large scale. To address this shortage, we have implemented in R a pipeline for the automatic extraction and curation of cell-line sensitivity data from ChEMBL version 19.…”
Section: Introduction (mentioning)
confidence: 99%
“…It is important to consider that correlation metrics depend on the range of the dependent variable, and hence one might obtain low errors in prediction (i.e., low RMSE values) and yet a low R² value if the dependent variable spans few bioactivity units (Alexander, Tropsha, & Winkler, 2015; Cortés-Ciriano et al., 2016). Determining whether a given model shows good generalization capabilities depends on the drug discovery stage in which it is applied.…”
Section: Understanding Results (mentioning)
confidence: 99%
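The point made in the statement above can be illustrated numerically. The following is a minimal sketch, not taken from any of the cited papers: it simulates predictions with the same noise level on a narrow and a wide range of the dependent variable and compares RMSE with R². The helper names (`rmse`, `r_squared`, `simulate`), the ranges of 1 and 6 units, and the noise level of 0.3 are all assumptions chosen for illustration, and R² is computed here as the coefficient of determination (1 - SS_res / SS_tot), which may differ from the definition used by the quoted authors.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (can be negative)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def simulate(value_range, noise_sd=0.3, n=2000, seed=0):
    """Hypothetical experiment: observed values are uniform on
    [0, value_range]; predictions are the observed values plus Gaussian
    noise with standard deviation noise_sd."""
    rng = random.Random(seed)
    y_true = [rng.uniform(0.0, value_range) for _ in range(n)]
    y_pred = [y + rng.gauss(0.0, noise_sd) for y in y_true]
    return rmse(y_true, y_pred), r_squared(y_true, y_pred)


# Same prediction error, different spread of the dependent variable.
narrow_rmse, narrow_r2 = simulate(value_range=1.0)  # spans 1 unit
wide_rmse, wide_r2 = simulate(value_range=6.0)      # spans 6 units

print(f"narrow range: RMSE={narrow_rmse:.2f}, R2={narrow_r2:.2f}")
print(f"wide range:   RMSE={wide_rmse:.2f}, R2={wide_r2:.2f}")
```

Under these assumptions the RMSE is nearly identical in both settings, while R² is high only when the observed values span a wide range, because SS_tot grows with the spread of the data and SS_res does not.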
“…We used the recommended values for RF hyperparameters (1000 for the number of trees and the square root of the number of considered features for mtry). We preferred this to tuning these hyperparameters for each training set, as RF tuning generally results in just marginal improvements at the cost of being much more computationally expensive [38,56,57]. As no model selection was carried out for this algorithm, standard LOOCV was performed to estimate the performance of RF using all the features (RF-all) on each data set (treatment-cancer type-molecular profile).…”
Section: Multi-gene Classifiers With Built-in Feature Selection (FS) (mentioning)
confidence: 99%
“…Indeed, while typically only tens of tumours have their response to the drug available, the molecular profiles of these tumours may easily aggregate over 50,000 features. To face this challenge, ML algorithms with built-in FS such as Elastic Nets [32-36], Ridge [33,36], LASSO [33,34,36,37] or Random Forest (RF) [12,33,35,36,38,39] have been used to model pharmacogenomics data from in vitro cell lines. For instance, RF ignores those features irrelevant for predicting drug response and thus has been able to tackle to some extent this challenge.…”
Section: Introduction (mentioning)
confidence: 99%