Eric Boittier scite author profile

Assessing Molecular Docking Tools to Guide Targeted Drug Discovery of CD38 Inhibitors

A promising protein target for computational drug development, the human cluster of differentiation 38 (CD38), plays a crucial role in many physiological and pathological processes, primarily through the upstream regulation of factors that control cytoplasmic Ca²⁺ concentrations. Recently, a small-molecule inhibitor of CD38 was shown to slow down pathways relating to aging and DNA damage. We examined the performance of seven docking programs for their ability to model protein-ligand interactions with CD38. A test set of twelve CD38 crystal structures, containing crystallized biologically relevant substrates, was used to assess pose prediction. The rankings for each program based on the median RMSD between the native and predicted poses were: Vina, AutoDock 4 > PLANTS, Gold, Glide, Molegro > rDock. Forty-two compounds with known affinities were docked to assess the accuracy of the programs at affinity/ranking predictions. The rankings based on scoring power were: Vina, PLANTS > Glide, Gold > Molegro >> AutoDock 4 >> rDock. Out of the top four performing programs, Glide had the only scoring function that did not appear to show bias towards overpredicting the affinity of the ligand based on its size. Factors that affect the reliability of pose prediction and scoring are discussed. General limitations and known biases of scoring functions are examined, aided in part by using molecular fingerprints and Random Forest classifiers. This machine learning approach may be used to systematically diagnose molecular features that are correlated with poor scoring accuracy.
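
The pose-prediction metric mentioned in this abstract (median RMSD between native and predicted poses) can be illustrated with a minimal sketch. It is not the authors' protocol: the function name `pose_rmsd` and the coordinates are hypothetical, atoms are assumed to be listed in the same order in both poses, no superposition is applied (both poses are taken to share the receptor frame), and symmetry-equivalent atoms are not handled.

```python
import math
from statistics import median


def pose_rmsd(native, predicted):
    """Root-mean-square deviation between two poses.

    Each pose is a list of (x, y, z) tuples in angstroms, with atoms in the
    same order. No alignment or symmetry correction is performed (assumption).
    """
    if not native or len(native) != len(predicted):
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (a - b) ** 2
        for atom_n, atom_p in zip(native, predicted)
        for a, b in zip(atom_n, atom_p)
    )
    return math.sqrt(squared / len(native))


# Hypothetical three-atom ligand, invented purely for illustration.
native_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
shifted_pose = [(x + 1.0, y, z) for x, y, z in native_pose]

# One RMSD per test complex; a program's figure of merit is then the median.
per_complex_rmsd = [
    pose_rmsd(native_pose, native_pose),
    pose_rmsd(native_pose, shifted_pose),
]
median_rmsd = median(per_complex_rmsd)
```

The median is less sensitive than the mean to a few badly failed dockings, which is one plausible reason to prefer it for a small test set.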

Transfer Learning to CCSD(T): Accurate Anharmonic Frequencies from Machine Learning Models

Käser

Boittier

Upadhyay

et al. 2021

J. Chem. Theory Comput.

The calculation of the anharmonic modes of small- to medium-sized molecules for assigning experimentally measured frequencies to the corresponding type of molecular motions is computationally challenging at sufficiently high levels of quantum chemical theory. Here, a practical and affordable way to calculate coupled-cluster quality anharmonic frequencies using second-order vibrational perturbation theory (VPT2) from machine-learned models is presented. The approach, referred to as “NN + VPT2”, uses a high-dimensional neural network (PhysNet) to learn potential energy surfaces (PESs) at different levels of theory from which harmonic and VPT2 frequencies can be efficiently determined. The NN + VPT2 approach is applied to eight small- to medium-sized molecules (H2CO, trans-HONO, HCOOH, CH3OH, CH3CHO, CH3NO2, CH3COOH, and CH3CONH2) and frequencies are reported from NN-learned models at the MP2/aug-cc-pVTZ, CCSD(T)/aug-cc-pVTZ, and CCSD(T)-F12/aug-cc-pVTZ-F12 levels of theory. For the largest molecules and at the highest levels of theory, transfer learning (TL) is used to determine the necessary full-dimensional, near-equilibrium PESs. Overall, NN + VPT2 yields anharmonic frequencies to within 20 cm⁻¹ of experimentally determined frequencies for close to 90% of the modes for the highest quality PES available and to within 10 cm⁻¹ for more than 60% of the modes. For the MP2 PESs, only ∼60% of the NN + VPT2 frequencies were within 20 cm⁻¹ of experiment, with outliers up to ∼150 cm⁻¹. It is also demonstrated that the approach provides correct assignments for strongly interacting modes such as the OH bending and the OH torsional modes in formic acid monomer and the CO-stretch and OH-bend modes in acetic acid.

Elevating CDCA3 levels in non-small cell lung cancer enhances sensitivity to platinum-based chemotherapy

et al. 2021

Platinum-based chemotherapy remains the cornerstone of treatment for most non-small cell lung cancer (NSCLC) cases either as maintenance therapy or in combination with immunotherapy. However, resistance remains a primary issue. Our findings point to the possibility of exploiting levels of cell division cycle associated protein-3 (CDCA3) to improve response of NSCLC tumours to therapy. We demonstrate in patient and in vitro analyses that CDCA3 levels correlate with measures of genome instability and platinum sensitivity, whereby CDCA3-high tumours are sensitive to cisplatin and carboplatin. In NSCLC, CDCA3 protein levels are regulated by the ubiquitin ligase APC/C and cofactor Cdh1. Here, we identified that the degradation of CDCA3 is modulated by activity of casein kinase 2 (CK2), which promotes an interaction between CDCA3 and Cdh1. Supporting this, pharmacological inhibition of CK2 with CX-4945 disrupts CDCA3 degradation, elevating CDCA3 levels and increasing sensitivity to platinum agents. We propose that combining CK2 inhibitors with platinum-based chemotherapy could enhance platinum efficacy in CDCA3-low NSCLC tumours and benefit patients.

Impact of the Characteristics of Quantum Chemical Databases on Machine Learning Prediction of Tautomerization Energies

Vazquez-Salazar

Boittier

Unke

et al. 2021

J. Chem. Theory Comput.

An essential aspect for adequate predictions of chemical properties by machine learning models is the database used for training them. However, studies that analyze how the content and structure of the databases used for training impact the prediction quality are scarce. In this work, we analyze and quantify the relationships learned by a machine learning model (Neural Network) trained on five different reference databases (QM9, PC9, ANI-1E, ANI-1, and ANI-1x) to predict tautomerization energies from molecules in Tautobase. For this, the effects of characteristics such as the number of heavy atoms in a molecule, number of atoms of a given element, bond composition, or initial geometry on the quality of the predictions are considered. The results indicate that training on a chemically diverse database is crucial for obtaining good results and also that conformational sampling can partly compensate for limited coverage of chemical diversity. The overall best-performing reference database (ANI-1x) performs on average 1 kcal/mol better than PC9, which, however, contains about 2 orders of magnitude fewer reference structures. On the other hand, PC9 is chemically more diverse by a factor of ∼5 as quantified by the number of atom-in-molecule-based fragments (amons) it contains compared with the ANI family of databases. A quantitative measure for deficiencies is the Kullback–Leibler divergence between reference and target distributions. It is explicitly demonstrated that when certain types of bonds need to be covered in the target database (Tautobase) but are undersampled in the reference databases, the resulting predictions are poor. Examples of this include the poor performance of all databases analyzed to predict C(sp2)–C(sp2) double bonds close to heteroatoms and azoles containing N–N and N–O bonds. Analysis of the results with a Tree MAP algorithm provides deeper understanding of specific deficiencies in predicting tautomerization energies by the reference datasets due to inadequate coverage of chemical space. This information can be used either to improve existing databases or to generate new databases of sufficient diversity for a range of machine learning (ML) applications in chemistry.
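
As a rough illustration of the Kullback–Leibler measure named in this abstract, the sketch below compares a target and a reference distribution of bond types. Everything specific here is an assumption rather than the authors' procedure: the direction KL(target || reference), the additive pseudocount used to avoid division by zero, the function name `kl_divergence`, and the bond counts, which are invented.

```python
import math


def kl_divergence(target_counts, reference_counts, pseudocount=1e-6):
    """Return KL(target || reference) in nats for two count dictionaries.

    A small pseudocount is added to every category so that categories
    missing from one distribution do not produce log(0) or division by zero.
    """
    categories = sorted(set(target_counts) | set(reference_counts))
    n = len(categories)
    t_total = sum(target_counts.values()) + pseudocount * n
    r_total = sum(reference_counts.values()) + pseudocount * n
    kl = 0.0
    for c in categories:
        p = (target_counts.get(c, 0) + pseudocount) / t_total
        q = (reference_counts.get(c, 0) + pseudocount) / r_total
        kl += p * math.log(p / q)
    return kl


# Hypothetical bond-type counts, invented for illustration only.
target = {"C-C": 50, "C=C": 30, "N-N": 20}
well_covered = {"C-C": 500, "C=C": 300, "N-N": 200}
undersampled = {"C-C": 700, "C=C": 299, "N-N": 1}

kl_good = kl_divergence(target, well_covered)
kl_poor = kl_divergence(target, undersampled)
```

With this direction, a bond type that is common in the target but rare in the reference (here `N-N`) dominates the sum, which matches the abstract's point about undersampled bond types.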

GlycoTorch Vina: Docking Designed and Tested for Glycosaminoglycans

Boittier

Burns

Gandhi

et al. 2020

J. Chem. Inf. Model.

Glycosaminoglycans (GAGs) are a family of anionic carbohydrates that play an essential role in the physiology and pathology of all eukaryotic life forms. Experimental determination of GAG–protein complexes is challenging due to their difficult isolation from biological sources, natural heterogeneity, and conformational flexibility, including possible ring puckering of sulfated iduronic acid from 1C4 to 2SO conformation. To overcome these challenges, we present GlycoTorch Vina (GTV), a molecular docking tool based on the carbohydrate docking program VinaCarb (VC). Our program is unique in that it contains parameters to model 2SO sugars while also supporting glycosidic linkages specific to GAGs. We discuss how crystallographic models of carbohydrates can be biased by the choice of refinement software and structural dictionaries. To overcome these variations, we carefully curated 12 of the best available GAG and GAG-like crystal structures (ranging from tetra- to octasaccharides or longer) obtained from the PDB-REDO server and refined using the same protocol. Both GTV and VC produced pose predictions with a mean root-mean-square deviation (RMSD) of 3.1 Å from the native crystal structure, a statistically significant improvement when compared to AutoDock Vina (4.5 Å) and the commercial software Glide (5.9 Å). Examples of how real-space correlation coefficients can be used to better assess the accuracy of docking pose predictions are given. Statistical distributions of empirical “salt bridge” interactions relevant to GAGs were compared to density functional theory (DFT) studies of model salt bridges and water-mediated salt bridges; however, there was generally poor agreement between these data. Water bridges appear to play an important, yet poorly understood, role in the structures of GAG–protein complexes. To aid in the rapid prototyping of future pose scoring functions, we include a module that allows users to include their own torsional and nonbonded parameters.

Pathway Bifurcation in the (4 + 3)/(5 + 2)-Cycloaddition of Butadiene and Oxidopyrylium Ylides: The Significance of Molecular Orbital Isosymmetry

Burns

Boittier

2019

J. Org. Chem.

By drawing analogies from the dimerization of cyclopentadiene, a novel reaction pathway bifurcation is uncovered in the cycloaddition of oxidopyrylium ylides and butadiene. Analysis of the potential energy surface (at the M06-2X/6-311+G(d,p) level of theory) in combination with Born–Oppenheimer molecular dynamics simulations (M06-2X/6-31+G(d)) demonstrates that both the (4 + 3)- and (5 + 2)-cycloaddition products are accessed from the same transition state. Key indicators of a pathway bifurcation (asynchronous bond formation, and a second transition state for the interconversion of the products) are also observed. The absence of a post-transition state bifurcation in the related oxidopyridinium systems of Krenske and Harmata is rationalized. Finally, the isosymmetry of the oxidopyrylium and cyclopentadiene molecular orbitals as well as the presence of "secondary orbital interactions" are emphasized as the common source of nonstatistical behavior. Application of these principles will allow for the rapid identification of new reaction pathway bifurcations.

The first HyDRA challenge for computational vibrational spectroscopy

Fischer

Bödecker

Schweer

et al. 2023

Phys. Chem. Chem. Phys.

Cross-Species Analysis of Glycosaminoglycan Binding Proteins Reveals Some Animal Models Are “More Equal” than Others

et al. 2019

Glycosaminoglycan (GAG) mimetics are synthetic or semi-synthetic analogues of heparin or heparan sulfate, which are designed to interact with GAG-binding sites on proteins. The preclinical stages of drug development rely on efficacy and toxicity assessment in animals and aim to apply these findings to clinical studies. However, such data may not always reflect the human situation, possibly because the GAG-binding site on the protein ligand in animals and humans could differ. Possible inter-species differences in the GAG-binding sites on antithrombin III, heparanase, and chemokines of the CCL and CXCL families were examined by sequence alignments, molecular modelling and assessment of surface electrostatic potentials to determine if one species of laboratory animal is likely to result in more clinically relevant data than another. For each protein, current understanding of GAG binding is reviewed from a protein structure and function perspective. This combinatorial analysis shows chemokine dimers and oligomers can present different GAG-binding surfaces for the same target protein, whereas a cleft-like GAG-binding site will differently influence the types of GAG structures that bind and the species preferable for preclinical work. Such analyses will allow an informed choice of animal(s) for preclinical studies of GAG mimetic drugs.
