Motivation
Self-supervised learning (SSL) learns data representations by exploiting supervision inherent in the data. It has attracted attention in drug research, where annotated data are scarce because experiments are time-consuming and expensive. SSL on enormous unlabeled datasets has shown excellent performance for molecular property prediction, but several issues remain. (1) Existing SSL models are large-scale, which limits their use where computing resources are insufficient. (2) Most current models do not use 3D structural information for molecular representation learning, or use it only partially, even though the activity of a drug is closely related to the structure of the drug molecule. (3) Previous models that apply contrastive learning to molecules use augmentations that permute atoms and bonds, so molecules with different characteristics can end up in the same positive samples. To solve these problems, we propose a novel contrastive learning framework, small-scale 3D Graph Contrastive Learning (3DGCL), for molecular property prediction.
Results
3DGCL learns the molecular representation by reflecting the molecule’s structure through a pre-training process that does not change the semantics of the drug. Using only 1,128 samples as pre-training data and 0.5 million model parameters, we achieve state-of-the-art or comparable performance on six benchmark datasets. Extensive experiments demonstrate that 3D structural information based on chemical knowledge is essential to molecular representation learning for property prediction.
Availability
Data and code are available at https://github.com/moonkisung/3DGCL.
Background: Low-dose computed tomography (LDCT) has improved the early detection of lung cancer. However, LDCT scans present several disadvantages, including an abundance of false-positive results, which lead to high socioeconomic cost, psychological burden, and repeated exposure to radiation. Therefore, complementary biomarkers are needed to select high-risk individuals for LDCT. Here, we show that granzyme B testing with a novel immunosensor has diagnostic value for identifying patients with lung cancer.
Methods: We enrolled 44 patients with lung cancer and 51 healthy controls at Pusan National University Yangsan Hospital in Korea between March 2018 and September 2019. The immunosensor analyzed serum granzyme B levels, and their association with lung cancer detection was evaluated with machine learning models.
Results: Serum granzyme B levels were assessed in samples from patients with lung cancer and healthy individuals. Granzyme B testing showed 100% sensitivity, 80% specificity, and an area under the curve of 0.938 for lung cancer detection. When granzyme B testing was combined with clinical predictors such as age, smoking status, or pack-years, a random forest model evaluated with five-fold cross-validation improved diagnostic accuracy to 92.1%, with a sensitivity, specificity, and area under the curve of 92.0%, 92.1%, and 0.977, respectively.
Conclusions: This feasibility study suggests that granzyme B may be used to detect lung cancer.
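For readers less familiar with the diagnostic metrics reported above, the following minimal Python sketch shows how sensitivity, specificity, and accuracy are conventionally computed from confusion-matrix counts. The function names and the counts are hypothetical, chosen only for illustration; they are not the study's data or its analysis code.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true cases that the test flags as positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of healthy controls that the test flags as negative."""
    return tn / (tn + fp)


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all subjects that the test classifies correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts for illustration only (NOT taken from the study):
# 10 patients (9 detected, 1 missed) and 10 controls (8 cleared, 2 flagged).
tp, fn, tn, fp = 9, 1, 8, 2

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
print(f"accuracy    = {accuracy(tp, tn, fp, fn):.2f}")
```

A test can reach very high sensitivity while its specificity stays lower, because the two metrics are computed over different groups (cases versus controls).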