In this paper, we compare the most popular Atom-to-Atom Mapping (AAM) tools: ChemAxon, [1] Indigo, [2] RDTool, [3] NameRXN (NextMove), [4] and RXNMapper, [5] which implement different AAM algorithms. The open-source RDTool program was optimized, and its modified version ("new RDTool") was considered together with several consensus mapping strategies. The Condensed Graph of Reaction approach was used to calculate chemical distances and to develop the "AAM fixer" algorithm for automated correction of erroneous mappings. The benchmarking calculations were performed on a Golden dataset containing 1851 manually mapped and curated reactions. The best-performing program, RXNMapper, was applied together with the AAM fixer to map the USPTO database. The Golden dataset, the mapped USPTO database, and the optimized RDTool are available in the GitHub repository https://github.com/Laboratoire-de-Chemoinformatique.
The definition of a model’s applicability domain (AD) is an active research topic in chemoinformatics. Although many AD definitions have been described in the literature for models predicting properties of molecules (Quantitative Structure-Activity/Property Relationship (QSAR/QSPR) models), none has been reported to date for models of chemical reactions (Quantitative Reaction-Property Relationships (QRPR)). A chemical reaction is a much more complex object than an individual molecule: its yield and its thermodynamic and kinetic characteristics depend not only on the structures of reactants and products but also on experimental conditions. The performance of QRPR models largely depends on the way the chemical transformation is encoded. In this study, various AD definition methods extensively used in QSAR/QSPR studies of individual molecules, as well as several novel approaches suggested in this work for reactions, were benchmarked on several reaction datasets. Their ability to exclude wrong reaction types, increase coverage, improve model performance, and detect Y-outliers was tested. As a result, several “best” AD definitions for QRPR models predicting reaction characteristics were identified and tested on a previously published external dataset with a clear AD definition problem.
A new water-soluble pillar[5]arene with an amide fragment and triethylammonium groups was synthesized by our original method of aminolysis of the ester groups. UV spectroscopy showed that cationic pillar[5]arenes are able to selectively form 1 : 1 complexes with some hydrophobic anions: the guests with bulky uncharged or negatively charged substituents hindering entry into the macrocycle cavity. Highly selective binding of the most lipophilic guest, methyl orange dye, in the form of organic anion salts by positively charged water-soluble pillar[5]arenes was detected. For the azo dye, the corresponding association constants (Kass) were 10-100-fold higher than those calculated for the other sulfonic acid derivatives studied. 2D ¹H-¹H NOESY NMR spectroscopy confirms the formation of the inclusion complex: the negatively charged sulfonate head lies outside the cavity of the pillar[5]arene, while the hydrophobic fragment of the guest is located inside it.
In this article, we consider cross-validation of quantitative structure-property relationship models for reactions and show that the conventional k-fold cross-validation (CV) procedure gives an 'optimistically' biased assessment of prediction performance. To address this issue, we suggest two strategies of model cross-validation: 'transformation-out' CV and 'solvent-out' CV. Unlike the conventional k-fold cross-validation approach, which does not consider the nature of the objects, the proposed procedures provide an unbiased estimate of the predictive performance of the models for novel types of structural transformations in chemical reactions and for reactions proceeding under new conditions. Both suggested strategies have been applied to predict the rate constants of bimolecular elimination and nucleophilic substitution reactions, and of Diels-Alder cycloaddition. All the suggested cross-validation methodologies, together with a tutorial, are available in the open-source software package CIMtools (https://github.com/cimmkzn/CIMtools).
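The idea behind 'transformation-out' and 'solvent-out' CV can be illustrated with a generic group-out splitter: every reaction carries a group label (its transformation or its solvent), and each label is confined to a single test fold, so a model is always evaluated on transformations or solvents it has not seen. The sketch below is a minimal pure-Python illustration under that assumption. It is not the CIMtools implementation; the function name `group_out_folds`, the greedy fold balancing, and the toy labels are all hypothetical.

```python
from collections import defaultdict


def group_out_folds(groups, n_folds):
    """Yield (train_idx, test_idx) pairs so that no group label
    appears in both the training and the test part of a fold."""
    members = defaultdict(list)
    for idx, label in enumerate(groups):
        members[label].append(idx)
    if n_folds > len(members):
        raise ValueError("more folds than distinct groups")

    # Greedy balancing (an illustrative choice): place the largest
    # groups first, always into the currently smallest fold.
    folds = [[] for _ in range(n_folds)]
    for label in sorted(members, key=lambda g: (-len(members[g]), str(g))):
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(members[label])

    for fold in folds:
        test = sorted(fold)
        held_out = set(test)
        train = [i for i in range(len(groups)) if i not in held_out]
        yield train, test


# Hypothetical toy data: one transformation label and one solvent per reaction.
transformations = ["SN2-a", "SN2-a", "E2-b", "E2-b", "DA-c", "DA-c", "SN2-d"]
solvents = ["water", "ethanol", "water", "acetone", "ethanol", "acetone", "water"]

transformation_out = list(group_out_folds(transformations, n_folds=3))
solvent_out = list(group_out_folds(solvents, n_folds=3))
```

In conventional k-fold CV the same transformation or solvent can fall on both sides of a split, which is the source of the optimistic bias described above; grouping by label removes that overlap.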
Pharmacophore modeling is usually considered a special type of virtual screening without a probabilistic nature: the correspondence of at least one conformation of a molecule to a pharmacophore is taken as evidence of its bioactivity. We show that pharmacophores can be treated as one-class machine learning models, and that a probability reflecting the model’s confidence can be assigned to a pharmacophore on the basis of its precision in identifying active compounds on a calibration set. Two schemes (Max and Mean) of probability calculation for consensus prediction based on individual pharmacophore models were proposed. Both approaches correspond to some extent to commonly used consensus approaches, such as the common hit approach or the one based on a logical OR operation uniting the hit lists of individual models. Unlike some known approaches, the proposed ones can rank compounds retrieved by multiple models. These approaches were benchmarked on multiple ChEMBL datasets used for ligand-based pharmacophore modeling and externally validated on the corresponding DUD-E datasets. The influence of pharmacophore complexity and of pharmacophore performance on a calibration set on the results of virtual screening was analyzed. It was shown that the Max and Mean approaches give better early enrichment than the commonly used approaches. Thus, a well-performing, easy-to-implement, and probabilistic alternative to existing approaches for pharmacophore-based virtual screening was proposed.
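As a rough illustration of the scheme described above, the sketch below assigns each pharmacophore its precision on a calibration set and then combines the precisions of the models that matched a compound. The abstract does not give the exact formulas, so two points here are assumptions: Max is taken as the highest precision among the matching models, and Mean as the average over the matching models only. All names and data (`precision`, `consensus_scores`, the `ph1`-`ph3` models, the compound identifiers) are hypothetical.

```python
def precision(hits, actives):
    """Fraction of calibration-set hits that are known actives."""
    hits = set(hits)
    if not hits:
        return 0.0
    return len(hits & set(actives)) / len(hits)


def consensus_scores(matched_models, model_precision):
    """Combine per-model precisions for one compound.

    Assumption: only models that matched the compound contribute,
    and a compound matched by no model scores 0.0.
    """
    matched = [model_precision[m] for m in matched_models]
    if not matched:
        return {"max": 0.0, "mean": 0.0}
    return {"max": max(matched), "mean": sum(matched) / len(matched)}


# Hypothetical calibration set: which compounds each pharmacophore retrieved.
actives = {"a1", "a2", "a3", "a4"}
calibration_hits = {
    "ph1": ["a1", "a2", "a3", "a4", "d1"],  # 4 of 5 hits are active
    "ph2": ["a1", "d1"],                    # 1 of 2 hits is active
    "ph3": ["a2", "d1", "d2", "d3"],        # 1 of 4 hits is active
}
model_precision = {m: precision(h, actives) for m, h in calibration_hits.items()}

# Hypothetical screening result: models matched by each compound.
screening_matches = {"cmpd_A": ["ph1", "ph3"], "cmpd_B": ["ph2"], "cmpd_C": []}
scores = {c: consensus_scores(m, model_precision) for c, m in screening_matches.items()}

# Rank compounds by the Max score, highest first.
ranked_by_max = sorted(scores, key=lambda c: scores[c]["max"], reverse=True)
```

Because each compound receives a continuous score rather than a hit/no-hit flag, compounds retrieved by several models can be ranked, which a plain logical OR over hit lists does not allow.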
The study of molecular adsorption on charged surfaces is important for biologically relevant substances, because the potential at an interface such as a living cell membrane is a significant parameter in their transport and transmembrane penetration. In this work, a hybrid optical/electrochemical surface‐enhanced Raman scattering (SERS) technique was applied to gain new insight into the adsorption state and conformational equilibrium of neocuproine, which serves as a nucleic acid biosensor in clinical diagnostics and has biological activity towards several types of carcinoma. Density functional theory calculations performed for several rotational conformations and their anion radicals were used to determine the geometrical and energetic characteristics, evaluate the rotational barrier, obtain the vibrational assignment, and consider the metal‐adsorbate charge transfer. The dependence of the SERS spectra on surface potential is ascribed to a change in the rotational dynamics of the methyl groups from hindered to almost free at potentials ≤−200 mV. It is demonstrated for the first time that SERS spectroscopy is capable of recognizing surface species that differ in the internal rotation of the methyl group.