The external prediction capability of quantitative structure-activity relationship (QSAR) models is often quantified using the predictive squared correlation coefficient, q². This index relates the predictive residual sum of squares, PRESS, to the activity sum of squares, SS, without postprocessing of the model output, whereas such postprocessing is done automatically when calculating the conventional squared correlation coefficient, r². According to the current OECD guidelines, q² for external validation should be calculated with SS referring to the training set activity mean. Our present findings, including a mathematical proof, demonstrate that this approach yields a systematic overestimation of the prediction capability that is triggered by the difference between the training and test set activity means. Example calculations with three regression models and data sets taken from the literature show further that for external test sets, q² based on the training set activity mean may become even larger than r². As a consequence, we suggest always using the test set activity mean when quantifying the external prediction capability through q², and revising the respective OECD guidance document accordingly. The discussion includes a comparison between r² and q² value ranges and the q² statistics for cross-validation.
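The effect described in this abstract can be illustrated with a minimal sketch, assuming the usual definition q² = 1 − PRESS/SS. The function name `q2` and all activity values below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from statistics import fmean


def q2(y_obs, y_pred, reference_mean):
    """Predictive squared correlation coefficient, q2 = 1 - PRESS / SS.

    reference_mean is the activity mean that SS refers to: either the
    training set mean or the test set mean.
    """
    press = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss = sum((o - reference_mean) ** 2 for o in y_obs)
    return 1.0 - press / ss


# Hypothetical activities (illustration only, not data from the paper).
y_train = [1.0, 2.0, 3.0, 4.0]   # training set, mean 2.5
y_test = [4.0, 5.0, 6.0]         # external test set, mean 5.0
y_pred = [4.5, 5.0, 5.5]         # model predictions for the test set

q2_train_mean = q2(y_test, y_pred, fmean(y_train))
q2_test_mean = q2(y_test, y_pred, fmean(y_test))

print(f"q2 with training set mean: {q2_train_mean:.3f}")
print(f"q2 with test set mean:     {q2_test_mean:.3f}")
```

For any test set of n compounds, SS taken about a constant c equals SS about the test set mean plus n times the squared difference between the test set mean and c. SS about the training set mean is therefore never smaller than SS about the test set mean, so q² based on the training set mean can only be equal or larger for the same predictions.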
Environmental contaminants are frequently encountered as mixtures, and the behavior of chemicals in a mixture may not correspond to that predicted from data on the pure compounds. This paper reviews current quantitative structure-activity relationship (QSAR) methodology for the analysis of mixture toxicity. Interactions of components in a mixture can cause complex and substantial changes in the apparent properties of its constituents, resulting in synergistic or antagonistic effects as opposed to the ideal reference case of additive behavior. Concentration addition (CA) and independent action (IA) are two prominent reference models for the evaluation of joint activity, and both have mechanistic support from pharmacology. After discussing graphical tools for analyzing binary mixtures and joint effect indices suitable also for multicomponent mixtures, water solubility and hydrophobicity of mixtures are analyzed with respect to the property contributions of the individual components. For water solubility, small but significant deviations from ideal behavior are observed even for simple organics, whereas at low concentrations, mixture hydrophobicity was found to agree approximately with the fractional contributions of the components. A variety of studies suggest that mixtures of compounds exerting only one (narcotic or specific) mode of action can be modeled satisfactorily by assuming CA, whereas the interaction of differently acting compounds tends to yield a less than CA joint activity. QSARs have been used to predict concentrations of components in mixtures from joint effects and defined mixture ratios, and have been developed to predict narcotic-type mixture toxicity from molecular descriptors that are calculated as composite properties according to the fractional concentrations of the mixture components.
In the case of ionogenic compounds, initial results suggest that CA may serve as a first-order approximation for the joint effect of un-ionized and ionized compound portions.
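The two reference models named in this abstract can be sketched as follows, assuming their common textbook forms: under CA the mixture effect concentration is the inverse of the fraction-weighted sum of inverse single-compound effect concentrations, and under IA the mixture effect is one minus the product of the single-compound non-effects. The function names and numbers are hypothetical, and the composite descriptor is shown as a plain fraction-weighted sum, which is one possible reading of "composite properties according to the fractional concentrations".

```python
import math


def ca_mixture_ecx(fractions, ecx_values):
    """Concentration addition: ECx_mix = 1 / sum(p_i / ECx_i).

    fractions are the relative concentrations p_i of the components
    (summing to 1); ecx_values are their single-compound ECx values.
    """
    if not math.isclose(sum(fractions), 1.0):
        raise ValueError("mixture fractions must sum to 1")
    return 1.0 / sum(p / ec for p, ec in zip(fractions, ecx_values))


def ia_mixture_effect(effects):
    """Independent action: E_mix = 1 - prod(1 - E_i), with E_i in [0, 1]."""
    return 1.0 - math.prod(1.0 - e for e in effects)


def composite_descriptor(fractions, descriptors):
    """Fraction-weighted mixture descriptor (assumed form: sum p_i * d_i)."""
    return sum(p * d for p, d in zip(fractions, descriptors))


# Hypothetical binary mixture (illustration only).
fractions = [0.5, 0.5]
ec50_values = [10.0, 40.0]    # single-compound EC50, arbitrary units
single_effects = [0.2, 0.5]   # fractional effects of each compound alone
log_kow = [2.0, 4.0]          # hypothetical hydrophobicity descriptors

ec50_mix_ca = ca_mixture_ecx(fractions, ec50_values)
effect_mix_ia = ia_mixture_effect(single_effects)
log_kow_mix = composite_descriptor(fractions, log_kow)

print(ec50_mix_ca, effect_mix_ia, log_kow_mix)
```

An observed mixture EC50 below the CA value would then indicate more than concentration-additive joint activity, and a value above it less than concentration-additive activity.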
Experimental pKa data for 16 aliphatic carboxylic acids are compared with calculated proton-transfer energies in the gas phase and in aqueous solution. The calculations are performed at the SCF and MP2 levels with inclusion of SCF-level entropic and thermochemical corrections to yield free energies of dissociation, using the basis sets 6-31G**, 6-31+G**, 6-311G(2d,2p), and 6-311+G(2d,2p) and the recently parametrized continuum-solvation method PCM-UAHF for the solvation contribution. Relative pKa trends are reproduced well with correlation coefficients (adjusted for degrees of freedom) of up to 0.97 and standard errors down to 0.24 log units, while the computational accuracy is not sufficient for predicting absolute proton-transfer energies. The latter is mainly caused by deficiencies of the underlying gas-phase calculations, as is demonstrated by a separate analysis of the gas-phase and solution-phase contributions to pKa.
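For orientation, the standard thermodynamic link between a free energy of dissociation in solution and pKa is pKa = ΔG / (RT ln 10). The sketch below applies this relation to a hypothetical free energy; the function name and the input value are assumptions made for illustration, and the abstract itself reports regression-based relative trends rather than absolute values obtained this way.

```python
import math

R_GAS = 8.314462618  # molar gas constant in J/(mol K)


def pka_from_free_energy(delta_g_kj_per_mol, temperature_k=298.15):
    """Convert a free energy of dissociation (kJ/mol) to pKa.

    Uses pKa = delta_G / (R * T * ln 10).
    """
    delta_g_j = delta_g_kj_per_mol * 1000.0
    return delta_g_j / (R_GAS * temperature_k * math.log(10.0))


# Hypothetical free energy of dissociation (not a value from the paper).
pka = pka_from_free_energy(27.0)
print(f"pKa = {pka:.2f}")
```

At 298.15 K, RT ln 10 is about 5.7 kJ/mol, so an error of that size in the computed free energy shifts the predicted pKa by one full unit. This sensitivity is consistent with the abstract's observation that relative trends are reproduced better than absolute values.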