2022
DOI: 10.1609/aaai.v36i6.20636

Self-Supervised Pre-training for Protein Embeddings Using Tertiary Structures

Abstract: The protein tertiary structure largely determines its interactions with other molecules. Despite its importance in various structure-related tasks, fully supervised data are often time-consuming and costly to obtain. Existing pre-training models mostly focus on amino-acid sequences or multiple sequence alignments, while structural information is not yet exploited. In this paper, we propose a self-supervised pre-training model for learning structure embeddings from protein tertiary structures. Native protein…

Citations: cited by 18 publications (13 citation statements)
References: 29 publications

“…We adopt the structural representation network proposed in [61] as our teacher network, utilizing the pretrained weights they provided. The teacher network is solely engaged in the process of structural information distillation, extracting representations from structural data, and does not participate in the training or inference processes of downstream tasks.…”
Section: Methods
confidence: 99%
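The statement above describes a frozen-teacher setup for structural-information distillation. As a rough illustration only, the sketch below (in PyTorch) freezes a pretrained structure encoder so it merely supplies target representations while a student network is trained against them; the class name DistillationTrainer, the teacher/student module interfaces, and the plain L2 distillation loss are assumptions for illustration, not the cited paper's actual implementation.

```python
# Minimal sketch (assumed names): a frozen, pretrained structure encoder acts as
# a teacher that only produces target representations; it is never updated and is
# not used at downstream-task inference time.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistillationTrainer:
    def __init__(self, teacher: nn.Module, student: nn.Module, lr: float = 1e-4):
        self.teacher = teacher.eval()            # pretrained structure encoder, kept frozen
        for p in self.teacher.parameters():
            p.requires_grad_(False)
        self.student = student                   # e.g. a sequence encoder being trained
        self.opt = torch.optim.Adam(self.student.parameters(), lr=lr)

    def step(self, structure_batch, sequence_batch):
        with torch.no_grad():
            target = self.teacher(structure_batch)   # structural representations (targets)
        pred = self.student(sequence_batch)           # student representations
        loss = F.mse_loss(pred, target)               # simple L2 distillation loss (assumed)
        self.opt.zero_grad()
        loss.backward()
        self.opt.step()
        return loss.item()
```

Keeping the teacher in eval mode with gradients disabled mirrors the statement's point that the teacher participates only in distillation, not in downstream training or inference.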
“…These edges are typically based on the Cα distances between the residues. GNNs utilize 𝒱 for diverse pretraining strategies like contrastive learning (Hermosilla & Ropinski, 2022; Zhang et al., 2023b;a), self-prediction (Yang et al., 2022; Chen et al., 2023) and denoising score matching (Guo et al., 2022; Wu et al., 2022a). Another way inspired by AF2 involves incorporating structure features as contact biases into the attention maps within the self-attention module, e.g., Uni-Mol (Zhou et al., 2023).…”
Section: Related Work
confidence: 99%
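The excerpt above refers to residue graphs whose edges are based on Cα distances. The following is a minimal sketch of that graph construction, assuming Cα coordinates have already been extracted; the function name residue_graph, the 10 Å default cutoff, and the toy coordinates are illustrative assumptions rather than details from any of the cited works.

```python
# Minimal sketch (assumed interface): build a residue graph whose directed edges
# connect residue pairs with Calpha-Calpha distance below a cutoff.
import numpy as np

def residue_graph(ca_coords: np.ndarray, cutoff: float = 10.0) -> np.ndarray:
    """ca_coords: (N, 3) array of Calpha coordinates for N residues.
    Returns an (E, 2) array of directed edges (i, j) with dist(i, j) < cutoff."""
    diff = ca_coords[:, None, :] - ca_coords[None, :, :]   # pairwise displacements
    dist = np.linalg.norm(diff, axis=-1)                    # (N, N) distance matrix
    mask = (dist < cutoff) & ~np.eye(len(ca_coords), dtype=bool)  # drop self-loops
    src, dst = np.nonzero(mask)
    return np.stack([src, dst], axis=1)

# Toy example: a 4-residue chain spaced 3.8 Å apart along the x-axis.
coords = np.array([[0.0, 0, 0], [3.8, 0, 0], [7.6, 0, 0], [11.4, 0, 0]])
edges = residue_graph(coords, cutoff=8.0)
```

A GNN-based pretraining pipeline of the kind cited above would then consume these edges together with per-residue node features.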
“…However, a simple similarity measure with a pre-set threshold is insufficient to assign high-confidence protein function. DL-based PFP methods include function prediction from AA-sequence (Rao et al., 2019; Alley et al., 2019; Elnaggar et al., 2020; Dallago et al., 2021; Kulmanov & Hoehndorf, 2020; Meier et al., 2021; Biswas et al., 2021; Gelman et al., 2021; Yang et al., 2022a), 3-dimensional structure (Gligorijević et al., 2021; Smaili et al., 2021; Guo et al., 2022), evolutionary relationships and genomic context (Rao et al., 2021; Engelhardt et al., 2005), and their combinations (Gligorijević et al., 2021). Here, we mainly restrict our scope to the sequence and structure.…”
Section: Related Work
confidence: 99%