2004
DOI: 10.1021/ci034243x

ESOL: Estimating Aqueous Solubility Directly from Molecular Structure

Abstract: This paper describes a simple method for estimating the aqueous solubility (ESOL--Estimated SOLubility) of a compound directly from its structure. The model was derived from a set of 2874 measured solubilities using linear regression against nine molecular properties. The most significant parameter was calculated logP(octanol), followed by molecular weight, proportion of heavy atoms in aromatic systems, and number of rotatable bonds. The model performed consistently well across three validation sets, predictin…
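The abstract describes a model obtained by linear regression of measured solubility against molecular descriptors. The following is a minimal sketch of that general technique (ordinary least squares with an intercept), not the paper's actual fitting procedure. The function names, the use of NumPy, and the demonstration data are all assumptions made for illustration: the data are synthetic random numbers, not measured solubilities, and the coefficients are arbitrary, not the published ESOL coefficients.

```python
import numpy as np


def fit_linear_model(X, y):
    """Fit y ~ intercept + X @ coefs by ordinary least squares.

    X: (n_compounds, n_descriptors) array of descriptor values.
    y: (n_compounds,) array of target values (e.g. log solubility).
    Returns (intercept, coefs).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first fitted weight is the intercept.
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta[0], beta[1:]


def predict(intercept, coefs, X):
    """Evaluate the fitted linear model on new descriptor rows."""
    return intercept + np.asarray(X, dtype=float) @ coefs


# Synthetic demonstration only: four hypothetical descriptor columns and
# arbitrary "true" weights, used to check that the fit recovers them.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4))
true_intercept = 0.5
true_coefs = np.array([-1.0, 0.2, 0.3, -0.4])
y_demo = true_intercept + X_demo @ true_coefs

intercept, coefs = fit_linear_model(X_demo, y_demo)
```

In practice the descriptor columns would hold values such as calculated logP, molecular weight, aromatic proportion, and rotatable-bond count, computed by a cheminformatics toolkit of the reader's choice.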

Citations: cited by 695 publications (601 citation statements)
References: 24 publications
“…To empirically exemplify this point, using the inner approach in Lusci et al (2013) and the outer approach in Duvenaud et al (2015) on the benchmark solubility data set in Delaney (2004), we obtain almost identical RMSE (root mean square error) of 0.61 and 0.60 respectively, in line with the best results reported in the literature.…”
Section: Discussion (supporting)
confidence: 82%
“…Recently, Delaney studied a much larger data set of 2874 compounds by using 9 simple descriptors that included calculated logP, molecular weight, aromatic proportion, non-carbon proportion, polar surface area, etc. [13] The performance of the model was listed as follows: n = 2874, m = 9, R² = 0.69, UAE = 0.75, RMSE = 1.01. In another report, Votano and Parham constructed a set of models with topological structure indices as descriptors using a variety of data analysis methods.…”
Section: Introduction (mentioning)
confidence: 99%
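The citation statements above compare models by RMSE and R². As a hedged illustration of how such figures are typically computed, here is a small pure-Python sketch. The function names and the four example values are made up for demonstration and are not taken from any cited paper. R² is computed here as the coefficient of determination; some papers report the squared correlation coefficient instead. The quoted text does not expand "UAE", so the mean absolute error shown is only one plausible reading.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_abs_error(y_true, y_pred):
    """Average of absolute (unsigned) prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up example values (not real solubility data).
observed = [-1.0, -2.0, -3.0, -4.0]
predicted = [-1.5, -2.0, -2.5, -4.0]

print(rmse(observed, predicted))
print(r_squared(observed, predicted))
print(mean_abs_error(observed, predicted))
```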
“…For several decades, researchers have tried to predict solubility parameters by applying artificial neural networks (ANNs) [3][4][5][6], genetic algorithms (GAs) [6], multiple linear regressions [7], partial least squares (PLSs) [8,9], support vector machines (SVMs) [10,11], random forest (RF) models [12] and so on. However, there are not many previous works to directly compute solubility parameters from solvation free energy that is the fundamental physical variable determining the solvation process.…”
Section: Introduction (mentioning)
confidence: 99%