A molecular mechanics force field implemented in the Sybyl program is described along with a statistical evaluation of its efficiency on a variety of compounds by analysis of internal coordinates and thermodynamic barriers. The goal of the force field is to provide good-quality geometries and relative energies for a large variety of organic molecules by energy minimization. Performance in protein modeling was tested by minimizations starting from crystallographic coordinates for three cyclic hexapeptides in the crystal lattice, with rms movements of 0.019 angstroms, 2.06 degrees, and 6.82 degrees for bond lengths, angles, and torsions, respectively, and an rms movement of 0.16 angstroms for heavy atoms. Isolated crambin was also analyzed, with rms movements of 0.025 angstroms, 2.97 degrees, and 13.0 degrees for bond lengths, angles, and torsions, respectively, and an rms movement of 0.42 angstroms for heavy atoms. Accuracy in calculating thermodynamic barriers was tested for 17 energy differences between conformers, 12 stereoisomers, and 15 torsional barriers. The rms errors were 0.8, 1.7, and 1.13 kcal/mol, respectively, for the three tests. Performance in general-purpose applications was assessed by minimizing 76 diverse complex organic crystal structures, with and without randomization by coordinate truncation, with rms movements of 0.025 angstroms, 2.50 degrees, and 9.54 degrees for bond lengths, angles, and torsions, respectively, and an average rms movement of 0.192 angstroms for heavy atoms.
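The heavy-atom rms movements quoted above can be illustrated with a minimal sketch. It assumes the crystal and minimized structures share one coordinate frame and that atoms are listed in the same order, so no superposition is performed; whether the original study superimposed structures first is not stated in the abstract. The function name and the coordinates are hypothetical.

```python
import math

def rms_movement(before, after):
    """Root-mean-square displacement between matched atom positions.

    Both arguments are equal-length lists of (x, y, z) tuples in angstroms.
    Assumption: both sets are in the same frame (no fitting is done here).
    """
    if len(before) != len(after) or not before:
        raise ValueError("coordinate lists must be non-empty and equal length")
    total = 0.0
    for (x0, y0, z0), (x1, y1, z1) in zip(before, after):
        # Squared Euclidean displacement of one atom.
        total += (x1 - x0) ** 2 + (y1 - y0) ** 2 + (z1 - z0) ** 2
    return math.sqrt(total / len(before))

# Hypothetical heavy-atom coordinates, invented for illustration only.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
minimized = [(0.1, 0.0, 0.0), (1.5, 0.1, 0.0), (1.5, 1.5, 0.1)]
print(round(rms_movement(crystal, minimized), 3))
```

The same formula applies to bond lengths, angles, and torsions if the per-atom displacement is replaced by the per-coordinate difference, although torsion differences would first need wrapping into the range of -180 to 180 degrees.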
Quantitative Structure-Activity Relationship (QSAR) modeling is one of the major computational tools employed in medicinal chemistry. However, throughout its entire history it has drawn both praise and criticism concerning its reliability, limitations, successes, and failures. In this paper, we discuss: (i) the development and evolution of QSAR; (ii) the current trends, unsolved problems, and pressing challenges; and (iii) several novel and emerging applications of QSAR modeling. Throughout this discussion, we provide guidelines for QSAR development, validation, and application, which are summarized in best practices for building rigorously validated and externally predictive QSAR models. We hope that this Perspective will foster communication between computational and experimental chemists toward collaborative development and use of QSAR models. We also believe that the guidelines presented here will help journal editors and reviewers apply more stringent scientific standards to manuscripts reporting new QSAR studies, as well as encourage the use of high-quality, validated QSARs for regulatory decision making.
To better evaluate, in the context of QSAR studies, new validation techniques such as bootstrapping and crossvalidation and the new analytic technique of partial least squares (PLS), seventeen QSAR results taken from nine recent publications were reexamined using these techniques. The results indicate that bootstrapping and crossvalidation are more powerful indicators of possible chance correlation than are the classical tests based on assumed normal independent distribution of variables. Although PLS will not detect all correlations existing within a set of data, its conservative behavior is particularly valuable when the candidate physicochemical descriptors are numerous and non‐orthogonal.
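As a rough illustration of the crossvalidation idea mentioned above, the following sketch computes a leave-one-out cross-validated q² for a single-descriptor least-squares line. It is a simplified stand-in under stated assumptions, not the procedure used in the paper: the model is ordinary linear regression rather than PLS, bootstrapping is not shown, q² is taken as 1 - PRESS/SS with SS measured about the mean of all activities, and the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def loo_q2(xs, ys):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS."""
    n = len(xs)
    press = 0.0
    for i in range(n):
        # Refit without compound i, then predict the held-out activity.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    mean_y = sum(ys) / n
    ss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss

# Hypothetical descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
print(loo_q2(descriptor, activity))
```

A q² far below the fitted r², or below zero, suggests that the apparent correlation does not survive when each compound is predicted from the others.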
When searching for new leads, testing molecules that are too "similar" is wasteful, but when investigating a lead, testing molecules that are "similar" to the lead is efficient. Two questions then arise. Which are the molecular descriptors that should be "similar"? How much "similarity" is enough? These questions are answered by demonstrating that, if a molecular descriptor is to be a valid and useful measure of "similarity" in drug discovery, a plot of differences in its values vs differences in biological activities for a set of related molecules will exhibit a characteristic trapezoidal distribution enhancement, revealing a "neighborhood behavior" for the descriptor. Applying this finding to 20 datasets allows 11 molecular diversity descriptors to be ranked by their validity for compound library design. In order of increasing frequency of usefulness, these are random numbers = log P = MR = strain energy < connectivity indices < 2D fingerprints (whole molecule) = atom pairs = autocorrelation indices < steric CoMFA fields = 2D fingerprints (side chain only) = H-bonding CoMFA fields.
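The plot described above pairs each difference in descriptor value with the corresponding difference in biological activity. A minimal sketch of how those points might be generated follows. It assumes a scalar descriptor and absolute differences over all unordered pairs of molecules; the function name and the data are hypothetical, and the paper's own scoring of the resulting distribution is not reproduced here.

```python
from itertools import combinations

def pairwise_differences(descriptor, activity):
    """Return (descriptor difference, activity difference) for every pair.

    Assumption: one scalar descriptor value and one activity per molecule,
    with both lists in the same molecule order.
    """
    if len(descriptor) != len(activity):
        raise ValueError("descriptor and activity lists must match in length")
    return [
        (abs(descriptor[i] - descriptor[j]), abs(activity[i] - activity[j]))
        for i, j in combinations(range(len(descriptor)), 2)
    ]

# Hypothetical values for three related molecules, for illustration only.
points = pairwise_differences([0.0, 1.0, 3.0], [5.0, 5.5, 7.0])
for d_diff, a_diff in points:
    print(d_diff, a_diff)
```

For a descriptor showing neighborhood behavior, pairs with a small descriptor difference should rarely show a large activity difference, so that region of the scatter plot stays sparse.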