Preprint, 2023
DOI: 10.26434/chemrxiv-2023-dngg4

Mole-BERT: Rethinking Pre-training Graph Neural Networks for Molecules

Abstract: Recent years have witnessed the prosperity of pre-training graph neural networks (GNNs) for molecules. Typically, atom types as node attributes are randomly masked and GNNs are then trained to predict the masked types, as in AttrMask (Hu et al., 2020), following the Masked Language Modeling (MLM) task of BERT (Devlin et al., 2019). However, unlike MLM, where the vocabulary is large, AttrMask pre-training does not learn informative molecular representations due to the small and unbalanced atom 'vocabulary'. T…
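For intuition, below is a minimal sketch (in PyTorch) of the AttrMask-style objective the abstract describes: a fraction of atom types is replaced by a mask token and a GNN is trained to recover the original types. The TinyGNN module, the 119-type atom vocabulary, and the attrmask_step helper are illustrative assumptions, not the authors' implementation.

# Minimal AttrMask-style pre-training sketch (assumptions: toy one-layer GNN,
# periodic-table-sized atom vocabulary; not the paper's actual code).
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_ATOM_TYPES = 119          # small atom "vocabulary" the abstract refers to
MASK_TOKEN = NUM_ATOM_TYPES   # extra embedding index acting as [MASK]

class TinyGNN(nn.Module):
    """One round of mean-neighbor message passing over atom embeddings."""
    def __init__(self, dim=64):
        super().__init__()
        self.embed = nn.Embedding(NUM_ATOM_TYPES + 1, dim)   # +1 for [MASK]
        self.update = nn.Linear(2 * dim, dim)
        self.head = nn.Linear(dim, NUM_ATOM_TYPES)           # predicts original atom type

    def forward(self, atom_types, edge_index):
        h = self.embed(atom_types)                            # [N, dim]
        src, dst = edge_index                                 # [2, E] directed edges
        agg = torch.zeros_like(h).index_add_(0, dst, h[src])  # sum incoming messages
        deg = torch.zeros(h.size(0), 1).index_add_(
            0, dst, torch.ones(src.size(0), 1)).clamp(min=1)  # in-degree for mean pooling
        h = torch.relu(self.update(torch.cat([h, agg / deg], dim=-1)))
        return self.head(h)                                   # [N, NUM_ATOM_TYPES] logits

def attrmask_step(model, atom_types, edge_index, mask_rate=0.15):
    """Mask a random subset of atoms and score the GNN on recovering their types."""
    mask = torch.rand(atom_types.size(0)) < mask_rate
    if not mask.any():                                        # make sure something is masked
        mask[0] = True
    corrupted = atom_types.clone()
    corrupted[mask] = MASK_TOKEN
    logits = model(corrupted, edge_index)
    return F.cross_entropy(logits[mask], atom_types[mask])

# Toy usage: a three-carbon chain (atomic number 6), edges listed in both directions.
atom_types = torch.tensor([6, 6, 6])
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]])
loss = attrmask_step(TinyGNN(), atom_types, edge_index, mask_rate=0.5)

Because most masked positions in real molecules are carbon, such a classifier can reach a low loss without learning much chemistry, which is one way to read the "small and unbalanced atom vocabulary" problem the abstract raises.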

Cited by 28 publications (6 citation statements)
References 36 publications
“…We experimentally compared two frequently used protein pre-training models, ProtT5 [17] and ESM-2 [18], to extract enzyme sequence features, and two small-molecule pre-training models, MolCLR [19] and Mole-BERT [20], to extract substrate structural features. Four feature-extraction combinations were generated: ProtT5+Mole-BERT, ProtT5+MolCLR, ESM-2+Mole-BERT, and ESM-2+MolCLR.…”
Section: MPEK Performance With Different Pre-trained Language Models (mentioning)
confidence: 99%
“…However, the current predictive models still have much room for improvement due to the limitations of the small number of datasets, insufficient types of enzymes, and incomplete structural characterization. Based on the ongoing development of pre-trained models for proteins and small molecules [17–21], molecular characterization using large-scale pre-trained models is now feasible, and such models effectively compensate for insufficient molecular characterization and small sample sizes [22–24].…”
Section: Introduction (mentioning)
confidence: 99%
“…As shown in Figure 4, a 3D molecule contains two types of information: the 3D structure and the atom type. Since atom type modeling is a well-defined problem that can readily be handled by an atomic MLM objective (Xia et al., 2022; Wang et al., 2019), we mainly focus on how to model the 3D structure and efficiently address the aforementioned problems in EnCD.…”
Section: 3D Molecular Representation Learning With Mol-AE (mentioning)
confidence: 99%
“…Early approaches to molecular representation learning predominantly focused on 1D SMILES (Wang et al, 2019; Chithrananda et al, 2020; Guo et al, 2021; Honda et al, 2019) and 2D graphs (Li et al, 2021; Lu et al, 2021; Fang et al, 2022b; Xia et al, 2022). Recently, there has been a growing interest in 3D molecular data, which could provide a more comprehensive reflection of physical properties, including information not captured by 1D and 2D data, such as conformation details.…”
Section: Related Work (mentioning)
confidence: 99%