We build a general and widely applicable clustering theory, which we call cross-entropy clustering (CEC for short), that combines the advantages of classical k-means (easy implementation and speed) with those of EM (affine invariance and the ability to adapt to clusters of desired shapes). Moreover, unlike k-means and EM, CEC finds the optimal number of clusters by automatically removing groups that carry no information. Although CEC, like EM, can be built on an arbitrary family of densities, in the most important case of Gaussian CEC the division into clusters is affine invariant, while the numerical complexity is comparable to that of k-means.
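To make the idea of Gaussian CEC concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a Lloyd-style (k-means-like) update in which each point is assigned to the cluster minimizing −ln p_i − ln N(x; m_i, Σ_i), where p_i is the fraction of points in cluster i; the abstract does not state this cost, so treat it as an assumption. The function name `gaussian_cec`, the removal threshold `min_frac`, the nearest-center initialization, and the small ridge added to each covariance are all hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_cec(X, k=10, min_frac=0.05, n_iter=100, seed=0):
    """Lloyd-style sketch of Gaussian CEC; returns one integer label per point."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Assumed initialization: k random data points as centers, nearest-center labels.
    centers = X[rng.choice(n, size=k, replace=False)]
    labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
    for _ in range(n_iter):
        ids, counts = np.unique(labels, return_counts=True)
        # Drop clusters that are too small (assumed stand-in for "carry no information").
        keep = ids[counts >= max(min_frac * n, d + 1)]
        if keep.size == 0:                       # guard: always keep the largest cluster
            keep = ids[[np.argmax(counts)]]
        cost = np.empty((n, keep.size))
        for j, c in enumerate(keep):
            pts = X[labels == c]
            p = len(pts) / n                     # cluster weight p_i
            mean = pts.mean(axis=0)
            # Sample covariance plus a small ridge for numerical stability (assumption).
            cov = np.cov(pts, rowvar=False).reshape(d, d) + 1e-6 * np.eye(d)
            diff = X - mean
            maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
            _, logdet = np.linalg.slogdet(cov)
            # Per-point cost: -ln p_i - ln N(x; mean, cov)
            cost[:, j] = -np.log(p) + 0.5 * (d * np.log(2 * np.pi) + logdet + maha)
        new_labels = keep[np.argmin(cost, axis=1)]
        if np.array_equal(new_labels, labels):   # partition is stable: stop
            break
        labels = new_labels
    return labels
```

Calling `gaussian_cec(X, k=10)` on an `(n, d)` array starts with up to ten groups and returns labels for whichever groups survive. Because each cluster is scored with its own full covariance, the assignment step is unchanged under an invertible affine map of the data (up to the ridge term), which is the affine-invariance property the abstract refers to.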
Docking is one of the most important steps in virtual screening pipelines, and it is an established method for examining potential interactions between ligands and receptors. However, this method is computationally expensive, and it is often among the last steps in the evaluation of compound libraries. In this work, we investigate the feasibility of training a deep neural network to predict the docking output directly from a two-dimensional compound structure. The developed protocol is orders of magnitude faster than typical docking software, and it returns ligand-receptor complexes encoded in the form of an interaction fingerprint. Its speed and efficiency open up new applications, such as screening compound libraries of vast size on the basis of contact patterns or docking score (derived from the predicted interaction schemes). We tested our approach on several G protein-coupled receptor targets and four CYP enzymes in retrospective virtual screening experiments, and a variant of the graph convolutional network proved most effective in emulating docking results. The method can be easily used by the community based on the code available in the Supporting Information.