2022
DOI: 10.48550/arxiv.2210.16484
Preprint

A Systematic Survey of Molecular Pre-trained Models

Abstract: Obtaining effective molecular representations is at the core of a series of important chemical tasks ranging from property prediction to drug design. So far, deep learning has achieved remarkable success in learning molecular representations through automated, data-driven feature learning. However, training deep neural networks from scratch often requires a large number of labeled molecules, which are expensive to acquire in real-world scenarios. To alleviate this issue, inspired by the success of the …

Cited by 4 publications (6 citation statements) | References 40 publications
“…Considering that 3D geometric information plays a vital role in predicting molecular properties, several recent works (Stärk et al., 2021; Fang et al., 2022a; Zhu et al., 2022) pre-train the GNN encoders on molecular datasets with 3D geometric information. We recommend that readers refer to a recent survey (Xia et al., 2022f) for more relevant literature. Many of the above-mentioned works adopt AttrMask (Hu et al., 2020) as a fundamental pre-training sub-task.…”
Section: Pre-training on Molecules
confidence: 99%
“…Deep learning has been successful in many domains, including computer vision (Zhou et al., 2020; Wang et al., 2022a, 2023), time series analysis (Xie et al., 2022; Meng Liu, 2021; Liu et al., 2022a), bioinformatics (Xia et al., 2022b; Gao et al., 2022), and graph data mining (Wang et al., 2020, 2021b; Zeng et al., 2022, 2023; Wu et al., 2022; Duan et al., 2022; Yang et al., 2022b; Liang et al., 2022b). Among these directions, deep graph clustering, which aims to encode nodes with neural networks and divide them into disjoint clusters, has attracted great attention in recent years.…”
Section: Deep Graph Clustering
confidence: 99%
“…The pretraining dataset primarily consists of unlabeled molecular data from extensive public databases such as ChEMBL, PubChem, and ZINC. Popular pretraining strategies fall under self-supervised learning (SSL), including masked component modeling, context prediction, replaced component detection, and contrastive learning. SSL methods start from the molecular structure itself to reveal inherent patterns. Given the close link between molecular structures and physicochemical properties, these methods play a crucial role in predicting molecular properties. They are frequently employed to establish a versatile pretraining model for various downstream tasks.…”
Section: Introduction
confidence: 99%
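The masked component modeling strategy quoted above can be illustrated with a minimal sketch of the corruption step: a fraction of atom types is replaced by a mask token, and the original types become reconstruction targets. The function name, the `[MASK]` token, and the 15% ratio are illustrative assumptions (BERT-style masking), not details from the surveyed methods.

```python
import random

MASK = "[MASK]"

def mask_atoms(atom_types, ratio=0.15, seed=0):
    """Randomly mask a fraction of atom types for masked-component pretraining.

    Returns the corrupted sequence and a dict {position: original type}
    that a model would be trained to reconstruct.
    """
    rng = random.Random(seed)
    n = max(1, int(len(atom_types) * ratio))  # mask at least one atom
    positions = rng.sample(range(len(atom_types)), n)
    corrupted = list(atom_types)
    targets = {}
    for p in positions:
        targets[p] = corrupted[p]
        corrupted[p] = MASK
    return corrupted, targets

# Example: atom types of an ethanol-like fragment (hypothetical input)
atoms = ["C", "C", "O", "H", "H", "H", "H", "H", "H"]
corrupted, targets = mask_atoms(atoms)
assert all(corrupted[p] == MASK for p in targets)
assert all(atoms[p] == t for p, t in targets.items())
```

A pretraining loop would then feed `corrupted` to an encoder and score its predictions at the masked positions against `targets`; the encoder and loss are omitted here.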