© 2020 American Chemical Society. Drug-induced liver injury (DILI) is a critical issue in drug development and postmarketing safety surveillance: it leads to failures in clinical trials as well as withdrawals of approved drugs from the market. It is therefore important to identify DILI compounds at an early stage through in silico and in vivo studies. This is difficult with conventional safety testing methods, since the predictive power of most existing frameworks is insufficient to address the problem. In our study, we employ a natural language processing (NLP)-inspired computational framework using convolutional neural networks and molecular fingerprint-embedded features. Our development set and independent test set contain 1597 and 322 compounds, respectively. These samples were collected from previous studies and matched against established chemical databases for structural validity. Our model achieves an average accuracy of 0.89, a Matthews correlation coefficient (MCC) of 0.80, and an AUC of 0.96. This is a substantial improvement in AUC over the recent best model: a relative gain of 6.67%, from 0.90 to 0.96. Our findings also suggest that the molecular fingerprint-embedded featurizer is an effective molecular representation for future biological and biochemical studies, alongside classic molecular fingerprints.
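The metrics reported in the abstract can be made concrete with a short sketch. The function names and the confusion-matrix counts below are illustrative assumptions, not values from the study; only the AUC figures 0.90 and 0.96 come from the text. The sketch shows how accuracy and MCC follow from a binary confusion matrix, and how the quoted 6.67% is a relative (not absolute) gain in AUC.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return numerator / denominator if denominator else 0.0


def relative_improvement(baseline, new):
    """Relative gain of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Hypothetical counts, chosen only to illustrate the formulas.
tp, tn, fp, fn = 90, 90, 10, 10
print(accuracy(tp, tn, fp, fn))
print(matthews_corrcoef(tp, tn, fp, fn))

# AUC values quoted in the abstract: 0.90 (previous best) and 0.96.
print(round(relative_improvement(0.90, 0.96), 2))  # 6.67
```

The absolute AUC gain is 0.06; dividing by the baseline of 0.90 gives the 6.67% figure.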