2022
DOI: 10.26434/chemrxiv-2021-djd3d-v2
Preprint

Group Contribution and Machine Learning Approaches to Predict Abraham Solute Parameters, Solvation Free Energy, and Solvation Enthalpy

Abstract: We present a group contribution method (SoluteGC) and a machine learning model (SoluteML) to predict the Abraham solute parameters, as well as a machine learning model (DirectML) to predict solvation free energy and enthalpy at 298 K. The proposed group contribution method uses atom-centered functional groups with corrections for ring and polycyclic strain whilst the machine learning models adopt a directed message passing neural network. The solute parameters predicted from SoluteGC and SoluteML are used to c…

Cited by 11 publications (24 citation statements)
References 72 publications
“…41,61 Finally, RMG's condensed-phase modeling capabilities were recently enhanced by incorporating corrections for diffusion-limited kinetics 62,63 and solvation energy corrections. 64,65 In the previous publication, 33 we presented the detailed kinetic model of the AIBN/H2O/CH3OH stress testing system itself (i.e., the "soup"), which was generated by APIOxy. Building upon our model of the AIBN soup, the following imipramine degradation model was generated for conditions resembling forced degradation experiments (Section 4): an initial mixture of 27.8 M water, 12.4 M methanol, 4.9 mM AIBN, 1.89 mM imipramine, 0.48 mM inert N2, and 0.27 mM O2 at 40 °C at pH 6 and 11.…”
Section: Methods (mentioning)
confidence: 99%
“…Other properties (ΔH_sub,298 K, C_p,g, and C_p,s) can be calculated using recently published correlations by Abraham and Acree 29,30 combined with machine learning predictions of solute parameters. 31 ■ RESULTS AND DISCUSSION SolProp Data Collection. In order to assess the accuracy and robustness of the predictions of the new methods and also to train some of the machine learning submodels we employed, extensive quantum chemical and experimental data sets were constructed or compiled in this work.…”
Section: S Dt (mentioning)
confidence: 99%
“…Aqueous Solid Solubility at 298 K. The performance of the aqueous solid solubility model is tested against randomly selected experimental data and against a separate test set with lower experimental uncertainty (579 data points). To model the performance against the test set with lower experimental 31 for small molecules; however, it is expected to be more robust for solutes with a higher molar mass because of the employed transfer learning method. 22 Performance of the New Models for Predicting Solid Solubility in Organic Solvents.…”
Section: S Dt (mentioning)
confidence: 99%
“…To numerically illustrate the application of SML towards such use cases, we have relied on the aforementioned specific example product (cyclopropylmethyl 2-(2-oxo-3,4-dihydro-2H-1,3-benzoxazin-3-yl)acetate), as well as 999 other boutique molecules (see data availability section for a complete list) with lead times of 3 to 4 weeks on average. As a relevant property, we have selected calculated free energy of solvation estimates 39,40 as a proxy to the measurement. As to be expected from our discussions in the preceding sections, resulting learning curves suggest that on average SML requires a magnitude less data than RML (Fig.…”
Section: Decision Making In Chemical Synthesis Management (mentioning)
confidence: 99%