With the rapidly increasing availability of High-Throughput Screening (HTS) data in the public domain, such as the PubChem database, methods for ligand-based computer-aided drug discovery (LB-CADD) have the potential to accelerate and reduce the cost of probe development and drug discovery efforts in academia. We assemble nine data sets from realistic HTS campaigns representing major families of drug target proteins for benchmarking LB-CADD methods. Each data set is in the public domain through PubChem and has been carefully collated using confirmation screens that validate active compounds. These data sets provide the foundation for benchmarking a new cheminformatics framework, BCL::ChemInfo, which is freely available for non-commercial use. Quantitative structure-activity relationship (QSAR) models are built using Artificial Neural Networks (ANNs), Support Vector Machines (SVMs), Decision Trees (DTs), and Kohonen networks (KNs). Problem-specific descriptor optimization protocols are assessed, including Sequential Feature Forward Selection (SFFS) and various information content measures. Measures of predictive power and confidence are evaluated through cross-validation, and a consensus prediction scheme is tested that combines orthogonal machine learning algorithms into a single predictor. Enrichments ranging from 15 to 101 at a true positive rate (TPR) cutoff of 25% are observed.
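To make the reported metric concrete, the following is a minimal sketch of how an enrichment at a fixed TPR cutoff could be computed. It assumes enrichment is defined as the precision among the top-ranked compounds divided by the fraction of actives in the whole set, with the ranked list cut as soon as the requested share of actives has been recovered. The function name `enrichment_at_tpr` and this exact definition are illustrative assumptions, not taken from BCL::ChemInfo.

```python
import math


def enrichment_at_tpr(scores, labels, tpr_cutoff=0.25):
    """Enrichment at the point where `tpr_cutoff` of all actives are recovered.

    scores: predicted activity scores (higher means more likely active).
    labels: 1 for active, 0 for inactive, in the same order as scores.
    Assumed definition: precision in the selected subset / overall active fraction.
    """
    total = len(labels)
    positives = sum(labels)
    if total == 0 or positives == 0:
        raise ValueError("need at least one compound and one active")

    # Number of actives that must be recovered to reach the TPR cutoff.
    needed = max(1, math.ceil(tpr_cutoff * positives))

    # Rank compounds from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_positives = 0
    for selected, (_, label) in enumerate(ranked, start=1):
        true_positives += label
        if true_positives >= needed:
            precision = true_positives / selected
            base_rate = positives / total
            return precision / base_rate

    raise RuntimeError("unreachable: all actives are in the ranked list")
```

Under this definition, an enrichment of 15 means that the top of the ranked list contains actives at 15 times the rate expected from random selection.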
Manganese (Mn) is both an essential biological cofactor and a neurotoxicant. Disruption of Mn biology in the basal ganglia has been implicated in the pathogenesis of neurodegenerative disorders, such as parkinsonism and Huntington's disease. Handling of other essential metals (e.g. iron and zinc) occurs via complex intracellular signaling networks that link metal detection and transport systems. However, beyond several non-selective transporters, little is known about the intracellular processes regulating neuronal Mn homeostasis. We hypothesized that small molecules that modulate intracellular Mn could provide insight into cell-level Mn regulatory mechanisms. We performed a high-throughput screen of 40,167 small molecules for modifiers of cellular Mn content in a mouse striatal neuron cell line. Following stringent validation assays and chemical informatics, we obtained a chemical ‘toolbox’ of 41 small molecules with diverse structure-activity relationships that can alter intracellular Mn levels under biologically relevant Mn exposures. We utilized this toolbox to test for differential regulation of Mn handling in human floor-plate lineage dopaminergic neurons, a lineage especially vulnerable to environmental Mn exposure. We report differential Mn accumulation between developmental stages and stage-specific differences in the Mn-altering activity of individual small molecules. This work demonstrates cell-level regulation of Mn content across neuronal differentiation.
Discoidin domain receptors (DDRs) 1 and 2 are transmembrane receptors that belong to the family of receptor tyrosine kinases (RTKs). Upon collagen binding, DDRs transduce cellular signaling involved in various cell functions, including cell adhesion, proliferation, differentiation, migration, and matrix homeostasis. Altered DDR function resulting from either mutations or overexpression has been implicated in several types of disease, including atherosclerosis, inflammation, cancer, and tissue fibrosis. Several established inhibitors, such as imatinib, dasatinib, and nilotinib, originally developed as Abelson murine leukemia (Abl) kinase inhibitors, have been found to inhibit DDR kinase activity. As we review here, recent discoveries of novel inhibitors and their co-crystal structures with the DDR1 kinase domain have made DDR1 amenable to structure-based drug discovery.
Small-angle X-ray scattering (SAXS) is used for low-resolution structural characterization of proteins, often in combination with other experimental techniques. After briefly reviewing the theory of SAXS, we discuss computational methods based on 1) the Debye equation and 2) spherical harmonics to compute intensity profiles from a particular macromolecular structure. Further, we review how these formulas are parameterized for solvent density and hydration shell adjustment. Finally, we introduce our solution to compute SAXS profiles utilizing GPU acceleration.
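As an illustration of the Debye equation mentioned above, the sketch below evaluates I(q) = Σ_i Σ_j f_i f_j sin(q r_ij) / (q r_ij) directly on the CPU. It is a simplified example under stated assumptions: the form factors are treated as constants independent of q, no solvent density or hydration shell terms are included, and nothing here reflects the GPU implementation the abstract refers to. The function name `debye_intensity` is hypothetical.

```python
import math


def debye_intensity(coords, form_factors, q):
    """Scattering intensity at momentum transfer q from the Debye equation.

    coords: list of (x, y, z) atom positions.
    form_factors: one scattering factor per atom (assumed q-independent here).
    q: momentum transfer, in inverse units of the coordinates.
    """
    n = len(coords)
    if n != len(form_factors):
        raise ValueError("coords and form_factors must have the same length")

    intensity = 0.0
    for i in range(n):
        for j in range(n):
            r_ij = math.dist(coords[i], coords[j])
            x = q * r_ij
            # sin(x)/x tends to 1 as x tends to 0 (self terms and q = 0).
            sinc = 1.0 if x == 0.0 else math.sin(x) / x
            intensity += form_factors[i] * form_factors[j] * sinc
    return intensity


def debye_profile(coords, form_factors, q_values):
    """Intensity profile over a list of q values."""
    return [debye_intensity(coords, form_factors, q) for q in q_values]
```

The double loop scales quadratically with the number of atoms, and each q value and atom pair can be evaluated independently of the others.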
The BioChemical Library (BCL) cheminformatics toolkit is an application-based academic open-source software package designed to integrate traditional small molecule cheminformatics tools with machine learning-based quantitative structure-activity/property relationship (QSAR/QSPR) modeling. In this pedagogical article we provide a detailed introduction to core BCL cheminformatics functionality, showing how traditional tasks (e.g., computing chemical properties, estimating druglikeness) can be readily combined with machine learning. In addition, we have included multiple examples covering areas of advanced use, such as reaction-based library design. We anticipate that this manuscript will be a valuable resource for researchers in computer-aided drug discovery looking to integrate modular cheminformatics and machine learning tools into their pipelines.
Peptidylglycine α-amidating monooxygenase (PAM) is a bifunctional enzyme which catalyzes the post-translational modification of inactive C-terminal glycine-extended peptide precursors to the corresponding bioactive α-amidated peptide hormone. This conversion involves two sequential reactions, each catalyzed by one of the separate catalytic domains of PAM. The first step, the copper-, ascorbate-, and O2-dependent stereospecific hydroxylation at the α-carbon of the C-terminal glycine, is catalyzed by peptidylglycine α-hydroxylating monooxygenase (PHM). The second step, the zinc-dependent dealkylation of the carbinolamide intermediate, is catalyzed by peptidylglycine amidoglycolate lyase. Quantum mechanical tunneling dominates PHM-dependent Cα-H bond activation. This study probes the substrate structure dependence of this chemistry using a set of N-acylglycine substrates of varying hydrophobicity. Primary deuterium kinetic isotope effects (KIEs), molecular mechanical docking, alchemical free energy perturbation, and equilibrium molecular dynamics were used to study the role played by ground-state substrate structure in PHM catalysis. Our data show that all N-acylglycines bind sequentially to PHM in an equilibrium-ordered fashion. The primary deuterium KIE displays a linear decrease with respect to acyl chain length for straight-chain N-acylglycine substrates. Docking orientation of these substrates displayed increased dissociation energy proportional to hydrophobic pocket interaction. The decrease in KIE with hydrophobicity was attributed to a pre-organization event which decreased reorganization energy by decreasing the conformational sampling associated with ground-state substrate binding. This is the first example of pre-organization in the family of non-coupled copper monooxygenases.
Availability of high-throughput screening (HTS) data in the public domain offers great potential to foster development of ligand-based computer-aided drug discovery (LB-CADD) methods crucial for drug discovery efforts in academia and industry. LB-CADD method development depends on high-quality HTS assay data, i.e., datasets that contain both active and inactive compounds. These active compounds are hits from primary screens that have been tested in concentration-response experiments and where the target-specificity of the hits has been validated through suitable secondary screening experiments. Publicly available HTS repositories such as PubChem often provide such data in a convoluted way: compounds that are classified as inactive need to be extracted from the primary screening record. However, compounds classified as active in the primary screening record are not suitable as a set of active compounds for LB-CADD experiments due to the high false-positive rate. A suitable set of actives can be derived by carefully analysing results from the five or more assays that are often used to confirm and classify the activity of compounds. These assays, in part, build on each other. However, often not all hit compounds from the previous screen have been tested. Sometimes a compound is classified as ‘active’ in an assay even though it is effectively ‘inactive’ on the target of interest, because it is ‘active’ on a different target protein. Here, a curation process of hierarchically related confirmatory screens is illustrated based on two specifically chosen protein use-cases. The subsequent re-upload procedure into PubChem is described for the findings of those two scenarios. Further, we provide nine publicly accessible high-quality datasets for future LB-CADD method development that provide a common baseline for comparison of future methods to the scientific community. We also provide a protocol researchers can follow to upload additional datasets for benchmarking.
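The curation logic described above can be sketched with plain set operations on compound identifiers. This is a simplified illustration, not the authors' protocol: it assumes a single primary screen, a single confirmatory screen, and a single counter screen against an off-target protein, and it assumes that primary hits which were never confirmed are left out of both sets. The function name `curate_dataset` and its arguments are hypothetical.

```python
def curate_dataset(primary_tested, primary_active, confirmed_active, off_target_active):
    """Derive active and inactive compound sets from related screens.

    All arguments are sets of compound identifiers (e.g. PubChem CIDs).
    primary_tested: every compound tested in the primary screen.
    primary_active: compounds flagged active in the primary screen.
    confirmed_active: compounds active in the confirmatory screen.
    off_target_active: compounds active in a counter screen on another target.
    """
    # Inactives come from the primary screening record.
    inactives = primary_tested - primary_active

    # Actives must be primary hits that were confirmed and are not
    # explained by activity on a different target protein.
    actives = (primary_active & confirmed_active) - off_target_active

    # Primary hits that were never confirmed stay unresolved (assumption).
    unresolved = primary_active - actives
    return actives, inactives, unresolved
```

For example, with `primary_tested = {1, 2, 3, 4, 5}`, `primary_active = {1, 2, 3}`, `confirmed_active = {1, 2}` and `off_target_active = {2}`, the sketch returns actives `{1}`, inactives `{4, 5}` and unresolved compounds `{2, 3}`.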