2020
DOI: 10.21203/rs.3.rs-84771/v1
Preprint

Pushing the limits of solubility prediction via quality-oriented data selection

Abstract: Accurate prediction of the solubility of chemical substances in solvents remains a challenge. The sparsity of high-quality solubility data is recognized as the biggest hurdle in the development of robust data-driven methods for practical use. Nonetheless, the effects of the quality and quantity of data on aqueous solubility predictions have not yet been scrutinized. In this study, the roles of the size and the quality of datasets on the performances of the solubility prediction models are unraveled, and the co…

Cited by 1 publication (4 citation statements)
References 36 publications
“…The AqSolPred model, which was used for solubility predictions in the current work, had previously been validated on a benchmark solubility dataset 24 . The model has a Mean Absolute Error of 0.348 LogS, which is lower than the conventional cheminformatics and ML methods that are ordinarily used for the prediction of aqueous solubility of chemical species 13 .…”
Section: Validation Of Solubility Predictions (mentioning)
confidence: 89%
“…It is an exemplary resource on quinone and aza-aromatic electroactive compounds as it contains several candidate molecules for batteries that are worthy of experimental investigation. The database contains comprehensive data that has been systematically collected by using the state-of-the-art computational procedures 11,12 and data-driven methods 13 . Therefore, it's also useful for other applications beyond ARFBs for which the intriguing chemistry of these molecules matter.…”
Section: Background and Summary (mentioning)
confidence: 99%