2020
DOI: 10.5599/admet.766

Prediction of aqueous intrinsic solubility of druglike molecules using Random Forest regression trained with Wiki-pS0 database

Abstract: The accurate prediction of solubility of drugs is still problematic. It was thought for a long time that shortfalls had been due to the lack of high-quality solubility data from the chemical space of drugs. This study considers the quality of solubility data, particularly of ionizable drugs. A database is described, comprising 6355 entries of intrinsic solubility for 3014 different molecules, drawing on 1325 citations. In an earlier publication, many factors affecting…

Cited by 45 publications (138 citation statements)
References 158 publications (181 reference statements)

“…This is an important characteristic differentiating the two groups. In the small-molecule set, the NHA and NHD groups overlap considerably, as illustrated elsewhere [20]. But, in the big-molecule set (Fig.…”
Section: Physicochemical Properties Of the Big Molecules
Citation type: mentioning
confidence: 81%
“…Of the new machine-learning statistical approaches, the Random Forest regression (RFR) method is thought to be one of the most accurate in predicting solubility [17][18][19][20]. RFR can be employed 'off the shelf,' requiring only minimal learning [19].…”
Section: Random Forest Regression
Citation type: mentioning
confidence: 99%
“…Performing high-quality solubility measurements is a difficult task due to uncertainties in experimental procedures, as explained in detail in Ref. [18]. Additionally, unintentional misprints, such as the erroneous conversions of values or units while carrying them from one source to another, cause deterioration in the quality of data.…”
Section: The Quality Of Data
Citation type: mentioning
confidence: 99%