Computer-aided research on the relationship between the molecular structures of natural compounds (NCs) and their biological activities has been carried out extensively because the molecular structures of new drug candidates are usually analogous to or derived from the molecular structures of NCs. To express this relationship in a physically realistic way on a computer, it is essential to have a molecular descriptor set that adequately represents the characteristics of the molecular structures in the chemical space of NCs. Although several topological descriptors have been developed to describe the physical, chemical, and biological properties of organic molecules, especially synthetic compounds, and have been widely used in drug discovery research, these descriptors have limitations in expressing NC-specific molecular structures. To overcome this, we developed a novel molecular fingerprint, called Natural Compound Molecular Fingerprints (NC-MFP), to explain NC structures related to biological activities and to apply it to natural product (NP)-based drug development. NC-MFP was developed to reflect the structural characteristics of NCs and the commonly used NP classification system. NC-MFP is a scaffold-based molecular fingerprint method comprising scaffolds, scaffold-fragment connection points (SFCP), and fragments. The scaffolds of the NC-MFP have a hierarchical structure. In this study, we introduce 16 structural classes of NPs in the Dictionary of Natural Products database (DNP), and the hierarchical scaffolds of each class were calculated using the Bemis and Murcko (BM) method. The scaffold library in NC-MFP comprises 676 scaffolds. To assess how well the NC-MFP represents the structural features of NCs compared with the molecular fingerprints widely used for representing organic molecules, two kinds of binary classification tasks were performed. 
Task I is a binary classification of the compounds in a commercially available library database as either an NC or a synthetic compound. Task II classifies NCs with inhibitory activity against seven biological target proteins as active or inactive. Both tasks were carried out with several molecular fingerprints, including NC-MFP, using the 1-nearest neighbor (1-NN) method. The results of task I showed that NC-MFP is a practical molecular fingerprint for classifying NC structures in the data set compared with other molecular fingerprints. In task II, NC-MFP outperformed the other molecular fingerprints, suggesting that NC-MFP is useful for explaining NC structures related to biological activities. In conclusion, NC-MFP is a robust molecular fingerprint for classifying NC structures and explaining the biological activities of NCs.
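As a rough illustration of the 1-NN step mentioned in this abstract, the sketch below gives a query fingerprint the label of its most similar training fingerprint. It assumes that fingerprints are stored as sets of "on" bit indices and that similarity is measured with the Tanimoto coefficient; the abstract does not state the similarity metric, and the function names and toy data are hypothetical.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def predict_1nn(query: set, train_fps: list, train_labels: list) -> str:
    """Return the label of the single most similar training fingerprint.

    Ties are resolved in favor of the earliest training entry.
    """
    best = max(range(len(train_fps)),
               key=lambda i: tanimoto(query, train_fps[i]))
    return train_labels[best]


# Toy data (hypothetical): each set lists the bits switched on in a fingerprint.
train_fps = [{0, 3, 7}, {0, 3, 8}, {1, 4, 9}, {2, 4, 9}]
train_labels = ["NC", "NC", "synthetic", "synthetic"]

if __name__ == "__main__":
    print(predict_1nn({0, 3, 9}, train_fps, train_labels))
```

With only one neighbor, the prediction depends entirely on how well the fingerprint separates the classes, which is presumably why 1-NN is a convenient way to compare fingerprints rather than classifiers.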
Drug-induced liver injury (DILI) is one of the major reasons for termination of drug development. Because predicting DILI in the early phases of drug development is important, diverse in silico models have been developed to filter out DILI-causing candidates before clinical study. However, no computational model has achieved sufficient predictive power for screening DILI in early phases because 1) drugs often cause liver injury through reactive metabolites, 2) different clinical outcomes of DILI have different mechanisms, and 3) the DILI label on drugs is not clearly defined. In this study, we developed binary classification models to predict drug-induced cholestasis, cirrhosis, hepatitis, and steatosis based on the structures of drugs and their metabolites. DILI-positive data were obtained from post-market reports of drugs and DILI-negative data from DILIrank, a database curated by the Food and Drug Administration (FDA). Support vector machine (SVM) and random forest (RF) were used to develop models with nine fingerprints and one 2D molecular descriptor calculated from drug (152 DILI-positives and 102 DILI-negatives) and drug metabolite (192 DILI-positives and 126 DILI-negatives) structures. Models were developed according to the Organisation for Economic Co-operation and Development (OECD) guidelines for quantitative structure-activity relationship (QSAR) validation. Internal and external validation was performed with a randomization test to thoroughly examine model predictability and avoid chance correlation between structural features and adverse outcomes. The applicability domain was defined with a leverage method for reliable prediction of new chemicals. The best models for each liver disease were selected based on external validation results from drugs (cholestasis: 70%, cirrhosis: 90%, hepatitis: 83%, and steatosis: 85%) and drug metabolites (cholestasis: 86%, cirrhosis: 88%, hepatitis: 86%, and steatosis: 83%) with applicability domain analysis. 
The compiled data sets were further used to derive privileged substructures, based on a Morgan fingerprint of level 2, that were more frequent in DILI-positive sets than in DILI-negative sets and in drug metabolite structures than in drug structures.
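The idea of substructures that are "more frequent" in one set than in another can be sketched as a simple frequency-ratio comparison. The code below is an assumption-laden illustration, not the procedure used in the study: each molecule is represented as a set of substructure keys (for example, Morgan fingerprint bit identifiers), and the ratio threshold `min_ratio` and the pseudo-frequency `pseudo` are hypothetical parameters chosen here only to avoid division by zero and to make the example concrete.

```python
from collections import Counter


def substructure_frequency(molecules: list) -> dict:
    """Fraction of molecules that contain each substructure key."""
    n = len(molecules)
    if n == 0:
        return {}
    counts = Counter(key for mol in molecules for key in set(mol))
    return {key: c / n for key, c in counts.items()}


def enriched_substructures(positives: list, negatives: list,
                           min_ratio: float = 2.0,
                           pseudo: float = 0.01) -> dict:
    """Keys whose frequency in `positives` is at least `min_ratio` times
    their frequency in `negatives` (both smoothed by `pseudo`)."""
    pos_freq = substructure_frequency(positives)
    neg_freq = substructure_frequency(negatives)
    enriched = {}
    for key, freq in pos_freq.items():
        ratio = (freq + pseudo) / (neg_freq.get(key, 0.0) + pseudo)
        if ratio >= min_ratio:
            enriched[key] = ratio
    return enriched


# Toy data (hypothetical substructure keys, not real fingerprint bits).
positives = [{"A", "B"}, {"A", "C"}, {"A"}, {"B"}]
negatives = [{"B"}, {"C"}, {"B", "C"}, {"D"}]

if __name__ == "__main__":
    print(sorted(enriched_substructures(positives, negatives)))
```

The same function could be applied to metabolite versus parent-drug structures by passing the metabolite set as `positives` and the drug set as `negatives`.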
This is the first attempt to perform meta-analysis on assay results in accordance with OECD and US EPA test guidelines for Daphnia magna. This study identified the significant experimental parameter that caused inconsistencies between the assay results from the novel dataset.
Background: Dog-associated infections are related to more than 70 human diseases. Given that diagnosing a dog's health requires the expertise of a veterinarian, an artificial intelligence model for detecting dog diseases could significantly reduce the time and cost required for a diagnosis and help maintain animal health efficiently. Objective: We collected normal and multispectral images to develop a classification model for each of three dog skin diseases (bacterial dermatosis, fungal infection, and hypersensitivity allergic dermatosis). Single models (normal image-based and multispectral image-based) and consensus models were developed using four CNN architectures (InceptionNet, ResNet, DenseNet, MobileNet), and the best-performing models were selected. Results: For the single models (normal image-based or multispectral image-based), the best accuracies and Matthews correlation coefficients (MCCs) on the validation data set were 0.80 and 0.64 for bacterial dermatosis, 0.70 and 0.36 for fungal infection, and 0.82 and 0.47 for hypersensitivity allergic dermatosis. For the consensus models, the best accuracies and MCCs on the validation set were 0.89 and 0.76 for the bacterial dermatosis data set, 0.87 and 0.63 for the fungal infection data set, and 0.87 and 0.63 for the hypersensitivity allergic dermatosis data set, respectively, indicating that the consensus models for each disease were more balanced and performed better. Conclusions: We developed consensus models for each dog skin disease by combining the best models developed with the normal and multispectral images, respectively. Since the normal images can be used to identify areas suspected of skin lesions and the multispectral images can additionally help confirm skin redness in those areas, the models achieved higher prediction accuracy with balanced performance between sensitivity and specificity.
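To make the reported metrics concrete, the sketch below combines two models' predicted probabilities into a consensus prediction and scores it with accuracy and the Matthews correlation coefficient. The abstract does not say how the consensus was formed, so averaging the two probabilities and thresholding at 0.5 is an assumption, and the function names and toy numbers are hypothetical.

```python
import math


def consensus_predict(p_normal: list, p_multispectral: list,
                      threshold: float = 0.5) -> list:
    """Average two models' positive-class probabilities, then threshold."""
    return [int((a + b) / 2 >= threshold)
            for a, b in zip(p_normal, p_multispectral)]


def confusion_counts(y_true: list, y_pred: list) -> tuple:
    """Return (TP, TN, FP, FN) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true: list, y_pred: list) -> float:
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true: list, y_pred: list) -> float:
    """Matthews correlation coefficient; returns 0.0 if undefined."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy validation set (hypothetical values, not data from the study).
y_true = [1, 1, 1, 0, 0, 0]
p_normal = [0.9, 0.4, 0.7, 0.6, 0.2, 0.1]
p_multispectral = [0.8, 0.7, 0.2, 0.3, 0.3, 0.4]

if __name__ == "__main__":
    y_pred = consensus_predict(p_normal, p_multispectral)
    print(accuracy(y_true, y_pred), mcc(y_true, y_pred))
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, which is why it is a common choice for judging whether a model is balanced between sensitivity and specificity.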
Since many drug development projects fail during clinical trials due to poor ADME properties, it is wise to introduce ADME tests at an early stage of drug discovery. Various experimental and computational methods have been developed to obtain ADME properties economically in terms of time and cost. As in vitro and in vivo experimental data on ADME have accumulated, the accuracy of in silico ADME models has increased, and many such models are now widely used in drug discovery. Driven by demand from drug discovery researchers, the development of in silico ADME models has become more active. In this chapter, the definitions of ADME endpoints are summarized, and in silico models related to ADME are introduced for each endpoint. Part I discusses prediction models of the physicochemical properties of compounds, which strongly influence the pharmacokinetics of pharmaceuticals. The prediction models of physical properties are based mainly on thermodynamics or are knowledge based, especially quantitative structure-activity relationship (QSAR) methods. Part II covers prediction models of ADME endpoints, which include both in vitro and in vivo assay results. Most models are QSAR based, and various kinds of descriptors (topological, 1D, 2D, and 3D descriptors) are used. Part III reviews physiologically based pharmacokinetic (PBPK) models.