Enzymes suffer from high cost, complex purification, and low stability, so low‐cost artificial enzymes of comparable or higher effectiveness are desirable. Given the complexity of these systems, it is desirable to predict their activities before running experiments. While computational approaches have been successful in modeling nanozyme activities, they require assumptions about the system. Machine learning (ML) is an alternative, data‐driven approach to material property prediction that achieves high performance even on complex multicomponent systems. Despite the growing demand for customized nanozymes, there is no open‐access nanozyme database. Here, a user‐friendly, expandable database of >300 existing inorganic nanozymes is built from data collected from >100 articles. Data analysis is performed to reveal the features responsible for the catalytic activities of nanozymes, and new descriptors are proposed for their ML‐assisted prediction. A random forest regression (RFR) model for evaluating nanozyme peroxidase activity is developed and optimized by correlation‐based feature selection and hyperparameter tuning, achieving performance of up to R² = 0.796 for Kcat and R² = 0.627 for Km. Prediction of previously unknown nanozyme activity, confirmed by experiment, is also demonstrated. Moreover, DiZyme, an expandable, open‐access resource containing the database, predictive algorithm, and visualization tool, is developed to boost novel nanozyme discovery worldwide (https://dizyme.net).
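The abstract does not say how the correlation‐based feature selection was carried out. One common reading is a redundancy filter: compute pairwise Pearson correlations between descriptors and drop any descriptor that is nearly collinear with one already kept. The sketch below illustrates that idea only. The descriptor names, the toy values, and the 0.9 threshold are all hypothetical and are not taken from the paper or the DiZyme database.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.9):
    """Walk the features in order; keep one only if its |r| with every
    previously kept feature is at or below the threshold."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy descriptors (assumed for illustration only).
features = {
    "size_nm": [5.0, 10.0, 20.0, 40.0, 80.0],
    "diameter_A": [50.0, 100.0, 200.0, 400.0, 800.0],  # size_nm in other units
    "pH": [4.0, 7.0, 4.5, 6.0, 5.0],
}

selected = drop_correlated(features)
print(selected)
```

Here `diameter_A` is a rescaled copy of `size_nm`, so it is filtered out, while `pH` is kept. The surviving descriptors would then be passed to a regressor such as a random forest.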
Organic chemistry has seen colossal progress due to machine learning (ML). However, translating artificial intelligence (AI) into materials science is challenging, and predicting biological behavior is even more complicated. Nanotoxicity is a critical parameter that describes the interaction of nanomaterials with living organisms and is screened in every bio‐related study. To avoid excessive experiments, such properties have to be evaluated in advance. Several existing ML models partially fill this gap by predicting whether a nanomaterial is toxic or not. Yet this binary categorization neglects the concentration dependencies that are crucial for experimental scientists. Here, an ML‐based approach is proposed for the quantitative prediction of inorganic nanomaterial cytotoxicity, achieving a 10‐fold cross‐validation (CV) Q² = 0.86 with a root mean squared error (RMSE) of 12.2%, obtained by correlation‐based feature selection and grid search‐based optimization of the model hyperparameters. To provide further model flexibility, quantitative atom property‐based nanomaterial descriptors are introduced, allowing the model to extrapolate to unseen samples. Feature importance is calculated to find an interpretable model with optimal decision‐making. These findings allow experimental scientists to perform primary in silico candidate screening and minimize the number of excessive, labor‐intensive experiments, enabling the rapid development of nanomaterials for medicinal purposes.
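For readers unfamiliar with the reported metrics, the following sketch shows how a cross‐validated Q² and RMSE can be computed. It assumes the usual definition Q² = 1 − PRESS/TSS, where PRESS is the sum of squared errors of the out‐of‐fold predictions and TSS is the total sum of squares around the mean; the abstract does not state the exact formula used. A one‐variable least‐squares line stands in for the paper's model, and the concentration/viability data are synthetic and purely illustrative.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx


def cross_val_predict(xs, ys, k=10):
    """Return out-of-fold predictions: each sample is predicted by a
    model fitted on the other k-1 folds (fold = index modulo k)."""
    n = len(xs)
    preds = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]
        test = [i for i in range(n) if i % k == fold]
        slope, intercept = fit_line([xs[i] for i in train],
                                    [ys[i] for i in train])
        for i in test:
            preds[i] = slope * xs[i] + intercept
    return preds


def q2_and_rmse(ys, preds):
    """Q2 = 1 - PRESS/TSS and RMSE, both from out-of-fold predictions."""
    n = len(ys)
    mean_y = sum(ys) / n
    press = sum((y - p) ** 2 for y, p in zip(ys, preds))
    tss = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / tss, math.sqrt(press / n)


# Synthetic data (assumed): viability in % falling with log-concentration,
# plus a deterministic +/-3 wobble so the fit is not perfect.
xs = [i * 0.1 for i in range(30)]
ys = [100.0 - 20.0 * x + (3.0 if i % 2 else -3.0) for i, x in enumerate(xs)]

preds = cross_val_predict(xs, ys, k=10)
q2, rmse = q2_and_rmse(ys, preds)
print(f"Q2 = {q2:.3f}, RMSE = {rmse:.2f}")
```

Because every sample is scored by a model that never saw it, Q² is a stricter figure than a training‐set R², and the RMSE is expressed in the same units as the predicted quantity.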