Convolutional neural networks (CNN) have been successfully used to handle three-dimensional data and are a natural match for data with spatial structure such as 3D molecular structures. However, a direct 3D representation of a molecule with atoms localized at voxels is too sparse, which leads to poor performance of the CNNs. In this work, we present a novel approach where atoms are extended to fill other nearby voxels with a transformation based on the wave transform. Experimenting on 4.5 million molecules from the Zinc database, we show that our proposed representation leads to better performance of CNN-based autoencoders than either the voxel-based representation or the previously used Gaussian blur of atoms and then successfully apply the new representation to classification tasks such as MACCS fingerprint prediction.
Finding an interpretable non-redundant representation of real-world data is one of the key problems in Machine Learning. Biological neural networks are known to solve this problem quite well in unsupervised manner, yet unsupervised artificial neural networks either struggle to do it or require fine tuning for each task individually. We associate this with the fact that a biological brain learns in the context of the relationships between observations, while an artificial network does not. We also notice that, though a naive data augmentation technique can be very useful for supervised learning problems, autoencoders typically fail to generalize transformations from data augmentations. Thus, we believe that providing additional knowledge about relationships between data samples will improve model's capability of finding useful inner data representation. More formally, we consider a dataset not as a manifold, but as a category, where the examples are objects. Two these objects are connected by a morphism, if they actually represent different transformations of the same entity. Following this formalism, we propose a novel method of using data augmentations when training autoencoders. We train a Variational Autoencoder in such a way, that it makes transformation outcome predictable by auxiliary network in terms of the hidden representation. We believe that the classification accuracy of a linear classifier on the learned representation is a good metric to measure its interpretability. In our experiments, present approach outperforms β-VAE and is comparable with Gaussian-mixture VAE.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.