2007
DOI: 10.1007/s10822-007-9125-z

Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules

Abstract: We investigate the use of different Machine Learning methods to construct models for aqueous solubility. Models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules of Bayer Schering Pharma. For each method, we also consider an appropriate method to obtain error bars, in order to estimate the domain of applicability (DOA) for each model. Here, we investigate error bars from a Bayesian model (Gaussian Process (GP)), an ensemble based approach (Random Forest), and approach…

Citations: cited by 55 publications (58 citation statements)
References: 32 publications (40 reference statements)
“…Schröter et al, 71 for instance, state that predictions of aqueous solubility for molecules whose structure falls outside the DOA are generally poor. In the literature there are several methods to estimate the DOA of a QSAR model.…”
Section: Results (mentioning)
confidence: 99%
“…On the other hand, more sophisticated QSPR models such as those using nonlinear statistical techniques have been proposed that rely solely on calculated molecular descriptors, without the need for the experimentally determined melting point (Palmer et al, 2007; Johnson and Zheng, 2006; Schroeter et al, 2007; Zhou et al, 2008; Huuskonen et al, 1998; Cheng and Merz, 2003; Wassvik et al, 2006; Hou et al, 2004; Bergström, 2005). Melting point is a measure of the crystal lattice energy that needs to be overcome during dissolution, hence the significance in solubility models.…”
Section: Introduction (mentioning)
confidence: 99%
“…evaluate several distance‐based measures to estimate the domain of applicability of QSAR models, as well as more sophisticated approaches. The applicability of Gaussian process models can be estimated using their built‐in estimate of predictive variance 23…”
Section: Related Work (mentioning)
confidence: 99%
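
The last statement above notes that a Gaussian process carries a built-in predictive variance that can serve as an applicability estimate. The following is a minimal sketch of that idea, not the method of the cited paper: it assumes a zero-mean prior, a squared-exponential kernel on a single scalar descriptor, and fixed hyperparameters. The function names (`rbf`, `gp_predict`), the `noise` and `length_scale` values, and the toy data are all hypothetical and chosen only for illustration.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L z = b by forward substitution."""
    z = []
    for i in range(len(b)):
        z.append((b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    return z

def solve_upper(L, z):
    """Solve L^T x = z by back substitution."""
    n = len(z)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_predict(x_train, y_train, x_new, noise=1e-2, length_scale=1.0):
    """Return the GP posterior mean and latent variance at x_new.

    Assumes a zero-mean prior; `noise` is added to the kernel diagonal.
    """
    K = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)]
         for i, a in enumerate(x_train)]
    L = cholesky(K)
    alpha = solve_upper(L, solve_lower(L, y_train))      # K^{-1} y
    k_star = [rbf(a, x_new, length_scale) for a in x_train]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)                           # L^{-1} k_star
    var = rbf(x_new, x_new, length_scale) - sum(vi * vi for vi in v)
    return mean, max(var, 0.0)

# Toy data (illustrative values only, not solubility measurements).
x_train = [0.0, 0.5, 1.0, 1.5]
y_train = [math.sin(x) for x in x_train]

for x_new in (0.75, 6.0):
    mean, var = gp_predict(x_train, y_train, x_new)
    print(f"x={x_new}: mean={mean:.3f}, std={math.sqrt(var):.3f}")
```

In this sketch the query at 0.75 lies between training points and gets a small variance, while the query at 6.0 lies far from all of them and its variance reverts toward the prior. A large predictive standard deviation is the signal one would read as "outside the domain of applicability"; where to place the threshold is a modelling choice that this sketch does not address.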