Over the past decade, pharmaceutical companies have seen a decline in the number of drug candidates successfully passing through clinical trials, even though billions are still spent on drug development. Poor aqueous solubility leads to low bioavailability, which reduces pharmaceutical effectiveness. The human cost of inefficient drug candidate testing is of great medical concern: fewer drugs reach the production line, which slows the development of new treatments. In biochemistry and biophysics, water-mediated reactions and interactions within active sites and protein pockets are an active area of research, in which methods for modelling solvated systems are continually pushed to their limits. Here, we discuss a range of methods for solvent modelling and solubility prediction, with the aim of informing the reader of the options available and outlining the advantages and disadvantages of each approach.
In recent literature, some authors claim to have successfully applied density functional theory (DFT) methods to the attractive interaction between rare-gas atoms. In this note, we make a critical survey of these works and come to the conclusion that, in contrast to the claims made, state-of-the-art DFT methods are incapable of accounting for dispersion effects in a quantitative way.
Ab initio semiglobal potential energy and dipole moment hypersurfaces for the isomerising HCN-HNC system are computed, using a grid of 242 points, principally at the all-electron cc-pCVQZ CCSD(T) level. Several potential energy hypersurfaces (PES) are presented including one which simultaneously fits 1527 points from earlier ab initio, smaller basis CCSD(T) calculations of Bowman et al. [J. Chem. Phys. 99, 308 (1993)]. The resulting potential is then morphed with 17 aug-cc-pCVQZ CCSD(T) points calculated at HNC geometries to improve the representation of the HNC part of the surface. The PES is further adjusted to coincide with three ab initio points calculated, at the cc-pCV5Z CCSD(T) level, at the critical points of the system. The final PES includes relativistic and adiabatic corrections. Vibrational band origins for HCN and HNC with energy up to 12 400 cm⁻¹ above the HCN zero-point energy are calculated variationally with the new surfaces. Band transition dipoles for the fundamentals of HCN and HNC, and a few overtone and hot band transitions for HCN have been calculated with the new dipole surface, giving generally very good agreement with experiment. The rotational levels of ground and vibrationally excited states are reproduced to high accuracy.
We demonstrate that the intrinsic aqueous solubility of crystalline druglike molecules can be estimated with reasonable accuracy from sublimation free energies calculated using crystal lattice simulations and hydration free energies calculated using the 3D Reference Interaction Site Model (3D-RISM) of the Integral Equation Theory of Molecular Liquids (IET). The solubilities of 25 crystalline druglike molecules taken from different chemical classes are predicted by the model with a correlation coefficient of R = 0.85 and a root mean square error (RMSE) equal to 1.45 log10S units, which is significantly more accurate than results obtained using implicit continuum solvent models. The method is not directly parametrized against experimental solubility data, and it offers a full computational characterization of the thermodynamics of transfer of the drug molecule from crystal phase to gas phase to dilute aqueous solution.
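The thermodynamic cycle described above (crystal to gas, then gas to dilute aqueous solution) can be illustrated with a minimal sketch. It assumes that both free energies are expressed in kJ/mol relative to a common 1 mol/L standard state and that the solution behaves ideally, so that the solution free energy is the sum of the two terms and log10 S follows from dividing by RT ln 10. The function name, default temperature and example values are hypothetical and are not taken from the study.

```python
import math

# Gas constant in kJ/(mol K), matching the kJ/mol units assumed for the inputs.
R_KJ_PER_MOL_K = 8.314462618e-3


def log10_solubility(dg_sub, dg_hyd, temperature=298.15):
    """Estimate log10 of the intrinsic solubility (S in mol/L).

    dg_sub: sublimation free energy, crystal -> gas, in kJ/mol (assumed input)
    dg_hyd: hydration free energy, gas -> water, in kJ/mol (assumed input)
    Both are assumed to share a 1 mol/L standard state.
    """
    # Thermodynamic cycle: crystal -> gas -> dilute aqueous solution.
    dg_sol = dg_sub + dg_hyd
    # dG_sol = -RT ln S, hence log10 S = -dG_sol / (RT ln 10).
    return -dg_sol / (R_KJ_PER_MOL_K * temperature * math.log(10))


# Illustrative values only: a lattice that costs 50 kJ/mol to sublime
# and a hydration free energy of -40 kJ/mol.
print(log10_solubility(dg_sub=50.0, dg_hyd=-40.0))
```

Because RT ln 10 is roughly 5.7 kJ/mol at room temperature, an error of that size in either free energy shifts the predicted solubility by about one log unit under these assumptions.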
We present four models of solution free-energy prediction for druglike molecules utilizing cheminformatics descriptors and theoretically calculated thermodynamic values. We make predictions of solution free energy using physics-based theory alone and using machine learning/quantitative structure–property relationship (QSPR) models. We also develop machine learning models where the theoretical energies and cheminformatics descriptors are used as combined input. These models are used to predict solvation free energy. While direct theoretical calculation does not give accurate results in this approach, machine learning is able to give predictions with a root mean squared error (RMSE) of ∼1.1 log S units in a 10-fold cross-validation for our Drug-Like-Solubility-100 (DLS-100) dataset of 100 druglike molecules. We find that a model built using energy terms from our theoretical methodology as descriptors is marginally less predictive than one built on Chemistry Development Kit (CDK) descriptors. Combining both sets of descriptors allows a further but very modest improvement in the predictions. However, in some cases, this is a statistically significant enhancement. These results suggest that there is little complementarity between the chemical information provided by these two sets of descriptors, despite their different sources and methods of calculation. Our machine learning models are also able to predict the well-known Solubility Challenge dataset with an RMSE value of 0.9–1.0 log S units.
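The RMSE from a 10-fold cross-validation quoted above can be illustrated with a minimal sketch. It is not the authors' workflow: a one-descriptor least-squares line stands in for their machine learning models, and the function names, the random seed and the pooling of errors across folds are all assumptions made for illustration.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x (placeholder model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return mean_y - slope * mean_x, slope


def kfold_rmse(xs, ys, k=10, seed=0):
    """RMSE pooled over all held-out predictions of a k-fold cross-validation."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    squared_errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        # Fit on the other k-1 folds only, then predict the held-out fold.
        intercept, slope = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            prediction = intercept + slope * xs[i]
            squared_errors.append((prediction - ys[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Synthetic data only: 100 hypothetical descriptor values with noisy log S targets.
rng = random.Random(1)
descriptor = [0.1 * i for i in range(100)]
log_s = [2.0 - 0.5 * x + rng.gauss(0.0, 1.0) for x in descriptor]
print(kfold_rmse(descriptor, log_s))
```

With 100 molecules and k = 10, each model is fitted on 90 molecules and tested on the remaining 10, so every molecule is predicted exactly once by a model that never saw it.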