2022
DOI: 10.48550/arxiv.2210.01765
Preprint

One Transformer Can Understand Both 2D & 3D Molecular Data

Abstract: Unlike vision and language data, which usually have a unique format, molecules can naturally be characterized using different chemical formulations. One can view a molecule as a 2D graph or define it as a collection of atoms located in 3D space. For molecular representation learning, most previous works designed neural networks only for a particular data format, making the learned models likely to fail for other data formats. We believe a general-purpose neural network model for chemistry should be able to handle …

Cited by 8 publications (26 citation statements) | References 37 publications

“…Nevertheless, inspiration can be drawn from molecule pretraining frameworks that encode 3D information even when only molecular graphs are available. Broadly speaking, these methods align embeddings or match property predictions between 3D models and graph-based models, the latter of which do not require 3D coordinates as input. In the realm of chemical reactions, data sets incorporating simulated reactive processes have been developed.…”
Section: Discussion
confidence: 99%
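For intuition, the following is a minimal sketch of the embedding-alignment idea described in this statement, written in PyTorch. The function name and the cosine objective are hypothetical illustrations; the cited frameworks differ in their encoders and exact training objectives.

```python
import torch
import torch.nn.functional as F

def alignment_loss(graph_emb: torch.Tensor, conf_emb: torch.Tensor) -> torch.Tensor:
    """Pull a molecule's 2D-graph embedding toward the embedding of its
    3D conformer (a generic sketch of the alignment idea, not any one
    cited framework's objective).

    graph_emb, conf_emb: (B, d) molecule-level embeddings produced by a
        graph-only encoder and a 3D encoder, respectively.
    """
    # Maximize per-molecule cosine agreement between the two views.
    return 1.0 - F.cosine_similarity(graph_emb, conf_emb, dim=-1).mean()
```

At inference time the graph-only encoder can then be used on its own, which is what makes this family of methods attractive when 3D coordinates are unavailable.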
“…We compare our FLT with the regular Performer without RPE. For the FLT model, we consider approximating RPE masks based on Gaussian basis functions, which are widely used in neural networks for molecular modeling (Gasteiger et al., 2021; Shi et al., 2022; Luo et al., 2022a). Specifically, the RPE mask is defined as $\mathbf{N} = [f(r_i - r_j)]_{i,j=1,\dots,L} \in \mathbb{R}^{L \times L}$, where $r_i \in \mathbb{R}^3$ is the position of the $i$-th input atom, $L$ is the total number of input atoms, and…”
Section: Compared Methods
confidence: 99%
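The quoted definition can be made concrete with a short sketch. The code below builds the RPE mask $\mathbf{N}$ from atom positions using Gaussian basis functions; the kernel centers, widths, and combination weights are fixed or random placeholders here (in practice they are typically learned), and $f$ is assumed to depend on $r_i - r_j$ only through its norm, a common modeling choice.

```python
import torch

def gaussian_basis_rpe_mask(positions: torch.Tensor,
                            num_kernels: int = 16,
                            cutoff: float = 5.0) -> torch.Tensor:
    """Build an L x L RPE mask N = [f(r_i - r_j)] from 3D atom positions
    using Gaussian basis functions (illustrative sketch).

    positions: (L, 3) tensor of atom coordinates.
    Returns: (L, L) mask.
    """
    # Pairwise distances ||r_i - r_j||, shape (L, L).
    dist = torch.cdist(positions, positions)

    # Gaussian kernel centers over [0, cutoff]; a shared fixed width is
    # an assumption here (both are often learned in practice).
    centers = torch.linspace(0.0, cutoff, num_kernels)
    width = cutoff / num_kernels

    # Expand each distance in the basis: shape (L, L, num_kernels).
    phi = torch.exp(-((dist.unsqueeze(-1) - centers) ** 2) / (2.0 * width ** 2))

    # Combine kernels into a scalar f(r_i - r_j) per pair; random weights
    # stand in for what would normally be learned parameters.
    weights = torch.randn(num_kernels)
    return phi @ weights
```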
“…GEM [11] proposes a bond-angle graph and self-supervised tasks that use large-scale unlabelled molecules with coarse 3D spatial structures, which can be computed by cheminformatics tools such as RDKit. Inspired by Graphormer's encoding method for 2D molecular graphs, Transformer-M [29] further encodes 3D spatial distances into attention biases, allowing it to take molecular data in either 2D or 3D format as input.…”
Section: Pre-training On Molecular Graph
confidence: 99%
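To illustrate how a 3D spatial distance can enter attention as a bias, here is a toy single-head attention step in PyTorch. The names, shapes, and the way the bias is produced are illustrative assumptions, not Transformer-M's exact implementation; the bias could come, for example, from a Gaussian-basis distance encoding like the sketch above.

```python
import torch
import torch.nn.functional as F

def attention_with_3d_bias(q: torch.Tensor, k: torch.Tensor,
                           v: torch.Tensor, dist_bias: torch.Tensor) -> torch.Tensor:
    """Single-head attention with a pairwise 3D-distance bias added to
    the logits (a sketch of the attention-bias idea, not the paper's API).

    q, k, v: (L, d) query/key/value matrices for L atoms.
    dist_bias: (L, L) bias derived from pairwise 3D distances.
    """
    d = q.size(-1)
    # Scaled dot-product logits plus the spatial bias.
    logits = (q @ k.transpose(-2, -1)) / d ** 0.5 + dist_bias
    return F.softmax(logits, dim=-1) @ v
```

When no 3D coordinates are available, the bias term can be dropped or replaced by 2D graph encodings, which is the spirit of accepting both data formats in one model.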
“…Baselines. Because the labels of the test datasets are officially hidden, we compared the validation results with the top-tier methods on the OGB leaderboard, which include GRPE [34], TokenGT [20], EGT [18], GPS [40], GEM-2 [26], Vis-Net [52], and Transformer-M [29]. In addition, we also compared against GPS++ [30], the winning solution of the OGB-LSC @ NeurIPS 2022 challenge.…”
Section: Pre-training On Large-scale Dataset
confidence: 99%