This paper addresses the evaluation of the predictive and extrapolative ability of QSPR models built with descriptors from the PaDEL-descriptor software. To this end, the training and external data sets were selected according to molecular weight. Two criteria were used to evaluate the extrapolative ability: a high correlation factor (R²) (criterion I), and a positive ΔR² together with a high R² (criterion II). Internal and external validation show that criterion II performs better than criterion I. Further selection criteria were obtained by considering the maximum squared correlation coefficient (Q²ext) or the minimum standard deviation (σext) of the external set, in combination with high R² values. These findings are supported by the systematic variation of the correlation factor (VCF) and the variation of correlation coefficients (VCC), two analysis tools proposed in this article. The methodology was successfully applied to the estimation of the critical temperature (Tc) of linear alkanes and aromatic compounds, considering extrapolation to the heaviest compounds. The descriptors obtained for the studied cases using criterion II are to some extent in agreement with group contribution and QSPR methods from the literature.
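As an illustration of the validation scheme summarized above, the following is a minimal sketch, not the authors' implementation. It assumes a single-descriptor linear model, made-up toy data (not real Tc values), and common textbook definitions: Q²ext is computed with the training-set mean in the denominator and σext as the root-mean-square residual of the external set. The exact definitions used in the paper, including ΔR², may differ and are not reproduced here. All function names are hypothetical, and PaDEL-descriptor itself is not invoked.

```python
from math import sqrt


def split_by_molecular_weight(records, n_external):
    """Put the n_external heaviest compounds in the external set (assumed scheme)."""
    if not 0 < n_external < len(records):
        raise ValueError("n_external must be between 1 and len(records) - 1")
    ordered = sorted(records, key=lambda r: r["mw"])
    return ordered[:-n_external], ordered[-n_external:]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(ys, preds):
    """Coefficient of determination on the set the model was fitted to."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def q2_ext(y_ext, pred_ext, y_train_mean):
    """External Q^2, referenced to the training-set mean (an assumed definition)."""
    ss_res = sum((y - p) ** 2 for y, p in zip(y_ext, pred_ext))
    ss_tot = sum((y - y_train_mean) ** 2 for y in y_ext)
    return 1.0 - ss_res / ss_tot


def sigma_ext(y_ext, pred_ext):
    """Root-mean-square residual of the external set (an assumed definition)."""
    ss_res = sum((y - p) ** 2 for y, p in zip(y_ext, pred_ext))
    return sqrt(ss_res / len(y_ext))


# Toy data: molecular weight, one descriptor value, one property value.
records = [{"mw": 10.0 * i, "x": float(i), "y": 2.0 * i + 1.0} for i in range(1, 9)]

train, external = split_by_molecular_weight(records, n_external=2)
slope, intercept = fit_line([r["x"] for r in train], [r["y"] for r in train])

y_train = [r["y"] for r in train]
pred_train = [slope * r["x"] + intercept for r in train]
y_ext = [r["y"] for r in external]
pred_ext = [slope * r["x"] + intercept for r in external]

print("R2 (training):", r_squared(y_train, pred_train))
print("Q2ext:", q2_ext(y_ext, pred_ext, sum(y_train) / len(y_train)))
print("sigma_ext:", sigma_ext(y_ext, pred_ext))
```

Because the external set contains only the heaviest compounds, Q²ext and σext in this sketch measure extrapolation beyond the molecular-weight range of the training set rather than interpolation within it.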