Traditional approaches to specifying a molecular mechanics force field encode all the information needed to assign force field parameters to a given molecule into a discrete set of atom types. This is equivalent to a representation consisting of a molecular graph comprising a set of vertices, which represent atoms labeled by atom type, and unlabeled edges, which represent chemical bonds. Bond stretch, angle bend, and dihedral parameters are then assigned by looking up bonded pairs, triplets, and quartets of atom types in parameter tables, and nonbonded parameters are assigned from the atom types themselves. This approach, which we call indirect chemical perception because it operates on the intermediate graph of atom-typed nodes, creates a number of technical problems. For example, atom types must be sufficiently complex to encode all necessary information about the molecular environment, making it difficult to extend force fields encoded this way. Atom typing also results in a proliferation of redundant parameters applied to chemically equivalent classes of valence terms, needlessly increasing force field complexity. Here, we describe a new approach to assigning force field parameters via direct chemical perception that avoids these problems, called the SMIRKS Native Open Force Field (SMIRNOFF) format. Rather than working through the intermediary of the atom-typed graph, direct chemical perception operates directly on the unmodified chemical graph of the molecule to assign parameters. In particular, parameters are assigned to each type of force field term (e.g., bond stretch, angle bend, torsion, and Lennard-Jones) based on standard chemical substructure queries implemented via the industry-standard SMARTS chemical perception language, using SMIRKS extensions that permit labeling of specific atoms within a chemical pattern.
We demonstrate the power and generality of this approach using examples of specific molecules that pose problems for indirect chemical perception, and construct and validate a minimalist yet very general force field, SMIRNOFF99Frosst. We find that a parameter definition file only ~300 lines long provides coverage of all but <0.02% of a five million molecule drug-like test set. Despite its simplicity, the accuracy of SMIRNOFF99Frosst for small molecule hydration free energies and selected properties of pure organic liquids is similar to that of the General Amber Force Field (GAFF), whose specification requires thousands of parameters. This force field provides a starting point for further optimization and refitting work to follow.
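The idea of assigning bond parameters by matching patterns directly against the chemical graph, rather than by looking up pairs of atom types, can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration: the `Molecule` and `BondRule` classes, the numeric parameter values, and the rule that a later, more specific pattern overrides an earlier generic one. The SMIRKS-like strings are display labels only; real SMARTS/SMIRKS matching is replaced here by plain Python predicates.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Bond = Tuple[int, int, int]  # (atom index i, atom index j, bond order)


@dataclass(frozen=True)
class Molecule:
    elements: Tuple[str, ...]  # element symbol per atom (heavy atoms only here)
    bonds: Tuple[Bond, ...]


@dataclass(frozen=True)
class BondRule:
    label: str                                 # SMIRKS-like text, not parsed
    matches: Callable[[Molecule, Bond], bool]  # stand-in for a substructure query
    length: float                              # hypothetical equilibrium length
    k: float                                   # hypothetical force constant


def carbon_carbon(order: int) -> Callable[[Molecule, Bond], bool]:
    """Build a predicate matching a carbon-carbon bond of the given order."""
    def matches(mol: Molecule, bond: Bond) -> bool:
        i, j, bond_order = bond
        return (mol.elements[i] == "C" and mol.elements[j] == "C"
                and bond_order == order)
    return matches


# Ordered from generic to specific; all numbers are placeholders, not fitted values.
RULES: List[BondRule] = [
    BondRule("[*:1]~[*:2]", lambda mol, bond: True, 1.50, 300.0),
    BondRule("[#6:1]-[#6:2]", carbon_carbon(1), 1.53, 310.0),
    BondRule("[#6:1]=[#6:2]", carbon_carbon(2), 1.34, 570.0),
]


def assign_bond_parameters(mol: Molecule,
                           rules: List[BondRule]) -> Dict[Tuple[int, int], BondRule]:
    """Apply every rule in order; a later match overwrites an earlier one."""
    assigned: Dict[Tuple[int, int], BondRule] = {}
    for rule in rules:
        for bond in mol.bonds:
            if rule.matches(mol, bond):
                assigned[(bond[0], bond[1])] = rule
    return assigned


# Heavy-atom skeletons only, to keep the example short.
propene = Molecule(("C", "C", "C"), ((0, 1, 2), (1, 2, 1)))
ethanol = Molecule(("C", "C", "O"), ((0, 1, 1), (1, 2, 1)))

for name, mol in (("propene", propene), ("ethanol", ethanol)):
    for atoms, rule in assign_bond_parameters(mol, RULES).items():
        print(name, atoms, rule.label, rule.length, rule.k)
```

In this sketch no atom carries a type: the C-O bond in ethanol simply falls through to the generic rule, and supporting new chemistry would mean appending one more rule rather than inventing new atom types and filling in every table entry that involves them.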
We present a methodology for defining and optimizing a general force field for classical molecular simulations, and we describe its use to derive the Open Force Field 1.0.0 small-molecule force field, codenamed Parsley. Rather than using traditional atom typing, our approach is built on the SMIRKS-native Open Force Field (SMIRNOFF) parameter assignment formalism, which handles increases in the diversity and specificity of the force field definition without needlessly increasing the complexity of the specification. Parameters are optimized with the ForceBalance tool, based on reference quantum chemical data that include torsion potential energy profiles, optimized gas-phase structures, and vibrational frequencies. These quantum reference data are computed and are maintained with QCArchive, an open-source and freely available distributed computing and database software ecosystem. In this initial application of the method, we present essentially a full optimization of all valence parameters and report tests of the resulting force field against compounds and data types outside the training set. These tests show improvements in optimized geometries and conformational energetics and demonstrate that Parsley's accuracy for liquid properties is similar to that of other general force fields, as is accuracy on binding free energies. We find that this initial Parsley force field affords accuracy similar to that of other general force fields when used to calculate relative binding free energies spanning 199 protein-ligand systems. Additionally, the resulting infrastructure allows us to rapidly optimize an entirely new force field with minimal human intervention.
Here, we focus on testing and improving force fields for molecular modeling, which see widespread use in diverse areas of computational chemistry and biomolecular simulation. A key issue affecting the accuracy and transferability of these force fields is the use of atom typing. Traditional approaches to defining molecular mechanics force fields must encode, within a discrete set of atom types, all information that will ever be needed about the chemical environment; parameters are then assigned by looking up combinations of these atom types in tables. This atom typing approach leads to a wide variety of problems such as inextensible atom-typing machinery, enormous difficulty in expanding parameters encoded by atom types, and unnecessary proliferation of encoded parameters. Here, we describe a new approach to assigning parameters for molecular mechanics force fields based on the industry standard SMARTS chemical perception language (with extensions to identify specific atoms available in SMIRKS). In this approach, each force field term (bonds, angles, torsions, and nonbonded interactions) features separate definitions assigned in a hierarchical manner without using atom types. We accomplish this using direct chemical perception, where parameters are assigned directly based on substructure queries operating on the molecule(s) being parameterized, thereby avoiding the intermediate step of assigning atom types, a step which can be considered indirect chemical perception. Direct chemical perception allows for substantial simplification of force fields, as well as additional generality in the substructure queries. This approach is applicable to a wide variety of (bio)molecular systems, and can greatly reduce the number of parameters needed to create a complete force field. Further flexibility can also be gained by allowing force field terms to be interpolated based on the assignment of fractional bond orders via the same procedure used to assign partial charges.
As an example of the utility of this approach, we provide a minimalist small molecule force field derived from Merck's parm@Frosst (an Amber parm99 descendant), in which a parameter definition file only ≈300 lines long can parameterize a large and diverse spectrum of pharmaceutically relevant small molecule chemical space. We benchmark this minimalist force field on the FreeSolv small molecule hydration free energy set and calculations of densities and dielectric constants from the ThermoML Archive, demonstrating that it achieves accuracy comparable to that of the General Amber Force Field (GAFF), which consists of many thousands of parameters.
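The interpolation of force field terms by fractional bond order mentioned above can be sketched in a few lines. The abstract does not state the functional form, so this sketch assumes a simple linear interpolation between reference values defined at bond orders 1 and 2; the function name and the numbers are hypothetical.

```python
def interpolate_parameter(bond_order: float,
                          value_at_1: float,
                          value_at_2: float) -> float:
    """Linearly interpolate a parameter between its single-bond (order 1)
    and double-bond (order 2) reference values. Linear form is an assumption."""
    return value_at_1 + (value_at_2 - value_at_1) * (bond_order - 1.0)


# Hypothetical torsion barrier references: 1.0 at order 1, 5.0 at order 2.
# A partially conjugated bond with fractional order 1.5 lands halfway between.
print(interpolate_parameter(1.5, 1.0, 5.0))
```

Under this assumption a single parameter definition with two reference values can cover a continuum of partially conjugated bonds, instead of requiring a separate discrete entry for each case.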
We describe the structure and optimization of the Open Force Field 1.0.0 small molecule force field, code-named Parsley. Parsley uses the SMIRKS-native Open Force Field (SMIRNOFF) parameter assignment formalism in which parameter types are assigned directly by chemical perception, in contrast to traditional atom type-based approaches. This method provides a natural means to incorporate increasingly diverse chemistry without needlessly increasing force field complexity. In this work, we present essentially a full optimization of the valence parameters in the force field. The optimization was carried out with the ForceBalance tool and was informed by reference quantum chemical data that include torsion potential energy profiles, optimized gas-phase structures, and vibrational frequencies. These data were computed and are maintained with QCArchive, an open-source and freely available distributed computing and database software ecosystem. Tests of the resulting force field against compounds and data types outside the training set show improvements in optimized geometries and conformational energetics and demonstrate that Parsley's accuracy for liquid properties is similar to that of other general force fields.
Background: Force fields are used in a wide variety of contexts for classical molecular simulation, including studies on protein-ligand binding, membrane permeation, and thermophysical property prediction. The quality of these studies relies on the quality of the force fields used to represent the systems. Methods: Focusing on small molecules of fewer than 50 heavy atoms, our aim in this work is to compare nine force fields: GAFF, GAFF2, MMFF94, MMFF94S, OPLS3e, SMIRNOFF99Frosst, and the Open Force Field Parsley, versions 1.0, 1.1, and 1.2. On a dataset comprising 22,675 molecular structures of 3,271 molecules, we analyzed force field-optimized geometries and conformer energies compared to reference quantum mechanical (QM) data. Results: We show that while OPLS3e performs best, the latest Open Force Field Parsley release is approaching a comparable level of accuracy in reproducing QM geometries and energetics for this set of molecules. Meanwhile, the performance of established force fields such as MMFF94S and GAFF2 is generally somewhat worse. We also find that the series of recent Open Force Field versions provide significant increases in accuracy. Conclusions: This study provides an extensive test of the performance of different molecular mechanics force fields on a diverse molecule set, and highlights two (OPLS3e and OpenFF 1.2) that perform better than the others tested on the present comparison. Our molecule set and results are available for other researchers to use in testing.
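One simple way to compare force field conformer energies with QM reference data is to look at relative energies within each molecule. The sketch below is illustrative only and is not taken from the study: the choice of the QM-minimum conformer as the common reference, the function names, and the example numbers are all assumptions.

```python
from math import sqrt
from typing import List, Sequence


def conformer_energy_errors(qm: Sequence[float], mm: Sequence[float]) -> List[float]:
    """Return the MM minus QM relative energy for each conformer.

    Both series are referenced to the conformer with the lowest QM energy
    (an assumed convention), so constant offsets between methods cancel.
    """
    if len(qm) != len(mm) or not qm:
        raise ValueError("need matching, non-empty energy lists")
    ref = min(range(len(qm)), key=lambda idx: qm[idx])
    return [(m - mm[ref]) - (q - qm[ref]) for q, m in zip(qm, mm)]


def rmse(errors: Sequence[float]) -> float:
    """Root-mean-square of the per-conformer errors."""
    return sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical energies (same units for both methods) for three conformers.
qm_energies = [0.0, 1.0, 2.5]
mm_energies = [10.0, 11.5, 12.0]

errors = conformer_energy_errors(qm_energies, mm_energies)
print(errors, rmse(errors))
```

Referencing both methods to the same conformer matters because absolute QM and force field energies are not on a common scale; only differences between conformers of the same molecule are comparable.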
The human voltage-gated proton channel Hv1 is a drug target for cancer, ischemic stroke, and neuroinflammation. It resides on the plasma membrane and endocytic compartments of a variety of cell types, where it mediates outward proton movement and regulates the activity of NOX enzymes. Its voltage-sensing domain (VSD) contains a gated and proton-selective conduction pathway, which can be blocked by aromatic guanidine derivatives such as 2-guanidinobenzimidazole (2GBI). Mutation of Hv1 residue F150 to alanine (F150A) was previously found to increase 2GBI apparent binding affinity by more than two orders of magnitude. Here, we explore the contribution of aromatic interactions between the inhibitor and the channel in the presence and absence of the F150A mutation, using a combination of electrophysiological recordings, classic mutagenesis, and site-specific incorporation of fluorinated phenylalanines via nonsense suppression methodology. Our data suggest that the increase in apparent binding affinity is due to a rearrangement of the binding site allowed by the smaller residue at position 150. We used this information to design new arginine mimics with improved affinity for the nonrearranged binding site of the wild-type channel. The new compounds, named “Hv1 Inhibitor Flexibles” (HIFs), consist of two “prongs,” an aminoimidazole ring, and an aromatic group connected by extended flexible linkers. Some HIF compounds display inhibitory properties that are superior to those of 2GBI, thus providing a promising scaffold for further development of high-affinity Hv1 inhibitors.
Accurate hydrogen placement in molecular modeling is crucial for studying the interactions and dynamics of biomolecular systems. Hydrogen atoms are difficult to locate with many experimental structural characterization approaches, for example because they scatter X-ray radiation only weakly. Hydrogen atoms are therefore usually added and positioned in silico when preparing experimental structures for modeling and simulation. The carboxyl functional group is a prototypical example of a functional group that requires protonation during structure preparation. To our knowledge, when in their neutral form, carboxylic acids are typically protonated in the syn conformation by default in classical molecular modeling packages, with no consideration of alternative conformations, though we are not aware of any careful examination of this topic. Here, we investigate the general belief that carboxylic acids should always be protonated in the syn conformation. We calculate and compare the relative energetic stabilities of syn and anti acetic acid using ab initio quantum mechanical calculations and atomistic molecular dynamics simulations. We show that while the syn conformation is the preferred state, the anti state may in some cases also be present under normal NPT conditions in solution.
Force fields are used in a wide variety of contexts for classical molecular simulation, including studies on protein-ligand binding, membrane permeation, and thermophysical property prediction. The quality of these studies relies on the quality of the force fields used to represent the systems. Focusing on small molecules of fewer than 50 heavy atoms, our aim in this work is to compare nine force fields: GAFF, GAFF2, MMFF94, MMFF94S, OPLS3e, SMIRNOFF99Frosst, and the Open Force Field Parsley, versions 1.0, 1.1, and 1.2. On a dataset comprising 22,675 molecular structures of 3,271 molecules, we analyzed force field-optimized geometries and conformer energies and compared these to reference quantum mechanical (QM) data. We show that while OPLS3e performs best, the latest Open Force Field Parsley release is approaching a comparable level of accuracy in reproducing QM geometries and energetics for this set of molecules. Meanwhile, the performance of established force fields such as MMFF94S and GAFF2 is generally somewhat worse. We also find that the series of recent Open Force Field versions provide significant increases in accuracy. Our molecule set and results are available for other researchers to use in testing.