Although the salt bridge is the strongest of all known noncovalent molecular interactions, no comprehensive study has yet examined its role and significance in drug design. Thus, a systematic study of the salt bridge in biological systems is reported herein, with a broad analysis of publicly available data from the Protein Data Bank, DrugBank, ChEMBL, and GPCRdb. The results revealed the distance and angular preferences as well as privileged molecular motifs of salt bridges in ligand–receptor complexes, which could be used to design the strongest interactions. Moreover, the energetic, directional, and spatial variability of salt bridges was investigated by quantum chemical calculations at the MP2 level on simple model systems mimicking salt bridges in a biological environment. Additionally, natural orbitals for chemical valence (NOCV) combined with the extended-transition-state (ETS) bond-energy decomposition method (ETS–NOCV) were analyzed and indicated a strong covalent contribution to the salt bridge interaction. The present results could be useful for implementation in rational drug design protocols.
This study explores a new approach to pharmacophore screening that uses an optimized linear combination of models instead of a single hypothesis. The developed methodology was implemented and evaluated on the complete known chemical space of 5-HT1AR ligands (3616 active compounds with Ki < 100 nM) acquired from the ChEMBL database. Clusters generated by three different methods served as the basis for the individual pharmacophore hypotheses, which were assembled into optimal combinations that maximize one of three measures of screening performance: MCC, accuracy, or recall. Various factors that influence filtering efficiency were investigated, including the clustering method, the composition of the test sets (random, the most diverse, and cluster population-dependent), and the hit mode (the compound must fit at least one or at least two models from the final combination). This method outperformed both single-hypothesis and random linear combination approaches.
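The idea of assembling individual hypotheses into a combination that maximizes MCC under a given hit mode can be sketched as follows. This is a minimal illustration only: it assumes each model has already been run against a labeled compound set and yields a boolean match per compound, and it uses a simple greedy forward selection as a stand-in. The function names, the input layout, and the greedy strategy are assumptions for illustration, not the optimization procedure used in the study.

```python
from math import sqrt


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when it is undefined."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def score_combination(matches, labels, combo, min_hits=1):
    """MCC of a combination of models.

    matches[i][j] is True when compound i fits model j (hypothetical input).
    A compound counts as a predicted hit when it fits at least `min_hits`
    models of `combo` (the "hit mode").
    """
    tp = tn = fp = fn = 0
    for row, active in zip(matches, labels):
        hit = sum(row[j] for j in combo) >= min_hits
        if hit and active:
            tp += 1
        elif hit:
            fp += 1
        elif active:
            fn += 1
        else:
            tn += 1
    return mcc(tp, tn, fp, fn)


def greedy_select(matches, labels, min_hits=1):
    """Greedy forward selection: add the model that most improves MCC,
    and stop when no remaining model improves it (assumed strategy)."""
    n_models = len(matches[0])
    combo, best = [], float("-inf")
    while True:
        candidate = None
        for j in range(n_models):
            if j in combo:
                continue
            s = score_combination(matches, labels, combo + [j], min_hits)
            if s > best:
                best, candidate = s, j
        if candidate is None:
            return combo, best
        combo.append(candidate)
```

Swapping `mcc` for an accuracy or recall function would give the other two optimization targets mentioned above; with `min_hits=2` a greedy start is weak (no single model can produce a hit), so an exhaustive search over pairs may be a better starting point in that mode.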
Structural fingerprints and pharmacophore modeling are methodologies that have been used for at least two decades in various fields of cheminformatics, from similarity searching to machine learning (ML). Advances in in silico techniques consequently led to combining both these methodologies into a new approach known as the pharmacophore fingerprint. Herein, we propose a high-resolution pharmacophore fingerprint called Pharmacoprint that encodes the presence, types, and relationships between pharmacophore features of a molecule. Pharmacoprint was evaluated in classification experiments by using ML algorithms (logistic regression, support vector machines, linear support vector machines, and neural networks) and outperformed other popular molecular fingerprints (i.e., ECFP4, Estate, MACCS, PubChem, Substructure, Klekota–Roth, CDK, Extended, and GraphOnly) and the ChemAxon pharmacophoric features fingerprint. Pharmacoprint consisted of 39 973 bits; several methods were applied for dimensionality reduction, and the best algorithm not only reduced the length of the bit string but also improved the efficiency of the ML tests. Further optimization allowed us to define the best parameter settings for using Pharmacoprint in discrimination tests and for maximizing statistical parameters. Finally, Pharmacoprint generated for three-dimensional (3D) structures with defined hydrogens as input data was applied to neural networks with a supervised autoencoder for selecting the most important bits and allowed us to maximize the Matthews correlation coefficient up to 0.962. The results show the potential of Pharmacoprint as a new, promising tool for computer-aided drug design.
Fluorine is a common substituent in medicinal chemistry and is found in up to 50% of the most profitable drugs. In this study, a statistical analysis is presented of the nature, geometry, and frequency of hydrogen bonds (HBs) formed between the aromatic and aliphatic C–F groups of small molecules and biological targets found in the Protein Data Bank (PDB) repository. Interaction energies were calculated for those complexes using three different approaches. The obtained results indicated that the interaction energy of F-containing HBs is determined by the donor–acceptor distance and not by the angles. Moreover, no significant relationship between the energies of HBs with fluorine and the donor type was found, implying that fluorine is a weak HB acceptor for all types of HB donors. However, the statistical analysis of the PDB repository revealed that the most populated geometric parameters of HBs did not match the calculated energetic optima. In short, HBs containing fluorine are forced to form by the stronger neighboring ligand–receptor interactions, which makes fluorine the “donor’s last resort”.
Metabolic stability is an important parameter to be optimized during the complex process of designing new active compounds. Tuning this parameter while maintaining the desired activity of a compound is not an easy task due to the extreme complexity of metabolic pathways in living organisms. In this study, a platform for in silico qualitative evaluation of metabolic stability, expressed as half-lifetime and clearance, was developed. The platform is based on the application of machine learning methods, and separate models for human, rat and mouse data were constructed. The evaluation of compounds is qualitative, and two types of experiments can be performed: regression, in which the compound is assigned to one of the metabolic stability classes (low, medium, high) on the basis of the numerical value of the predicted half-lifetime, and classification, in which the molecule is directly assessed as having low, medium or high stability. The results show that the models have good predictive power, with accuracy values over 0.7 in all cases for the Sequential Minimal Optimization (SMO), k-nearest neighbor (IBk) and Random Forest algorithms. Additionally, for each of the analyzed compounds, the 10 most similar structures from the training set (in terms of Tanimoto similarity) are identified and made available for download as separate files for more detailed manual inspection. The predictive power of the models was compared against an external dataset containing metabolic stability assessments from the GUSAR software, showing good consistency of results for SMOreg and Naïve Bayes (~0.8 on average). The tool is available online.
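Two steps described above lend themselves to a short sketch: binning a predicted half-lifetime into a stability class, and retrieving the most similar training-set structures by Tanimoto similarity. The code below is illustrative only and assumes fingerprints are given as sets of on-bit indices; the function names and the class thresholds (passed in as `low_cut` and `high_cut`) are hypothetical, since the cutoffs used by the platform are not stated here.

```python
def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def most_similar(query, training, k=10):
    """Return the k training compounds most similar to `query`.

    `training` maps a compound name to its fingerprint (hypothetical layout).
    Ties are broken alphabetically by name so the output is deterministic.
    """
    scored = [(tanimoto(query, fp), name) for name, fp in training.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(name, sim) for sim, name in scored[:k]]


def stability_class(half_life, low_cut, high_cut):
    """Map a predicted half-lifetime to low/medium/high stability.

    The thresholds are caller-supplied assumptions, not values from the study.
    """
    if half_life < low_cut:
        return "low"
    if half_life < high_cut:
        return "medium"
    return "high"
```

In the regression-type experiment, a regressor's numerical prediction would be passed through something like `stability_class`, whereas the classification-type experiment predicts the class label directly.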
In a search for new anti-HIV-1 chemotypes, we developed a multistep ligand-based virtual screening (VS) protocol combining machine learning (ML) methods with the privileged structures (PS) concept. In its learning step, the VS protocol was based on HIV integrase (IN) inhibitors fetched from the ChEMBL database. Various ML methods and the PS weighting scheme were evaluated for performance and applied as VS filtering criteria. Finally, a database of 1.5 million commercially available compounds was virtually screened using a multistep ligand-based cascade, and 13 selected unique structures were tested by measuring the inhibition of HIV replication in infected cells. This approach resulted in the discovery of two novel chemotypes with moderate antiretroviral activity, which, together with their topological diversity, makes them good candidates as lead structures for future optimization.
The growing computational capabilities of tools used in the broad field of computer-aided drug design have made virtual screening extremely popular in the search for new biologically active compounds. Most often, the source of such molecules consists of commercially available compound databases, but they can also be searched for within libraries of structures generated in silico from existing ligands. Various computational combinatorial approaches are based solely on the chemical structure of compounds, using different types of substitutions to form new molecules. In this study, the starting point for combinatorial library generation was the fingerprint referring to the optimal substructural composition in terms of the activity toward a considered target, which was obtained using a machine learning-based optimization procedure. The systematic enumeration of all possible connections between preferred substructures resulted in the formation of target-focused libraries of new potential ligands. The compounds were initially assessed by machine learning methods using a hashed fingerprint to represent molecules; the distribution of their physicochemical properties was also investigated, as well as their synthetic accessibility. The examination of various fingerprints and machine learning algorithms indicated that the Klekota-Roth fingerprint and support vector machine were an optimal combination for such experiments. This study was performed for 8 protein targets, and the obtained compound sets and their characterization are publicly available at http://skandal.if-pan.krakow.pl/comb_lib/ .
The complexes of selected long-chain arylpiperazines with homology models of 5-HT1A, 5-HT2A, and 5-HT7 receptors were investigated using quantum mechanical methods. The molecular geometries of the ligand-receptor complexes were first optimized with the ONIOM (Our own N-layered Integrated molecular Orbital and molecular Mechanics) method. Next, the fragment molecular orbital method with an energy decomposition analysis scheme (FMO-EDA) was employed to estimate the interaction energies in binding sites. The results clearly showed that orthosteric binding sites of the studied serotonin receptors have both attractive and repulsive regions. In the case of 5-HT1A and 5-HT2A, two repulsive areas, located in the lower part of the binding pocket, and one large area of attraction engaging many residues at the top of all helices were identified. Additionally, for the 5-HT7 receptor, a third area of destabilization located at the extracellular end of helix 6 was found.