Motivation
The acid dissociation constant (pKa) is a critical parameter that reflects the ionization ability of chemical compounds and is widely used across industries. However, the experimental determination of pKa is intricate and time-consuming, especially the exact determination of micro pKa values at the atomic level. Hence, fast and accurate prediction of the pKa values of chemical compounds is of broad interest.
Results
Here, we compiled a large-scale pKa dataset containing 16595 compounds with 17489 pKa values. Based on this dataset, a novel pKa prediction model, named Graph-pKa, was established using graph neural networks. Graph-pKa performed well on the prediction of macro pKa values, with a mean absolute error of around 0.55 and a coefficient of determination of around 0.92 on the test dataset. Furthermore, by incorporating multi-instance learning, Graph-pKa was also able to automatically deconvolute the predicted macro pKa into discrete micro pKa values.
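As a minimal sketch of the quantities mentioned above, the following Python snippet shows how a mean absolute error and a coefficient of determination are typically computed, and how site-level micro pKa values can be combined into a single macro pKa. The function names and toy numbers are hypothetical and are not taken from the Graph-pKa dataset or code. The aggregation rule is the textbook relation for the first dissociation of an acid with several ionizable sites (the macroscopic Ka is the sum of the microscopic Ka values); it is an assumption for illustration and may differ from the aggregation used inside Graph-pKa.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted pKa."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def macro_pka_from_micro(micro_pkas):
    """Combine acidic micro pKa values into one macro pKa.

    Assumption (textbook relation, not necessarily Graph-pKa's rule):
    for the first dissociation, Ka_macro = sum of Ka_micro over sites.
    """
    ka_macro = sum(10.0 ** (-pka) for pka in micro_pkas)
    return -math.log10(ka_macro)


if __name__ == "__main__":
    # Toy values for illustration only; not real measurements.
    measured = [1.0, 2.0, 3.0]
    predicted = [1.5, 2.0, 2.5]
    print("MAE:", mean_absolute_error(measured, predicted))
    print("R^2:", r_squared(measured, predicted))
    # Two equivalent acidic sites, each with a hypothetical micro pKa of 4.0.
    print("macro pKa:", macro_pka_from_micro([4.0, 4.0]))
```

Under this relation, two equivalent sites make the macro pKa lower than either micro pKa by log10(2), which illustrates why a measured macro value alone does not identify the contribution of each individual site.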
Availability
The Graph-pKa model is now freely accessible via a web-based interface (https://pka.simm.ac.cn/).
Supplementary information
Supplementary data are available at Bioinformatics online.