2023 · Preprint
DOI: 10.21203/rs.3.rs-2469268/v1

When Geometric Deep Learning Meets Pretrained Protein Language Models

Abstract: Geometric deep learning has recently achieved great success in non-Euclidean domains, and learning on the 3D structures of large biomolecules is emerging as a distinct research area. However, its efficacy is largely constrained by the limited quantity of structural data. Meanwhile, protein language models trained on substantial 1D sequences have shown capabilities that grow with scale across a broad range of applications. Nevertheless, no preceding studies have considered combining these different protein modalities to …
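The combination the abstract describes — per-residue embeddings from a pretrained protein language model fused with geometric features derived from the 3D structure — can be illustrated with a minimal message-passing sketch. This is not the paper's architecture: the class name, layer choices, and dimensions (e.g. the ESM-style 1280-dim embeddings) are illustrative assumptions.

```python
import torch
import torch.nn as nn

class FusionGNNLayer(nn.Module):
    """One message-passing layer over a residue graph whose node features
    concatenate per-residue language-model embeddings with geometric features."""
    def __init__(self, plm_dim: int, geo_dim: int, hidden_dim: int):
        super().__init__()
        self.proj = nn.Linear(plm_dim + geo_dim, hidden_dim)
        self.msg = nn.Linear(2 * hidden_dim, hidden_dim)
        self.update = nn.GRUCell(hidden_dim, hidden_dim)

    def forward(self, plm_emb, geo_feat, edge_index):
        # plm_emb:    (N, plm_dim) per-residue embeddings from a pretrained PLM
        # geo_feat:   (N, geo_dim) structure-derived features (e.g. distances, angles)
        # edge_index: (2, E) residue-residue contacts taken from the 3D structure
        h = torch.relu(self.proj(torch.cat([plm_emb, geo_feat], dim=-1)))
        src, dst = edge_index
        messages = torch.relu(self.msg(torch.cat([h[src], h[dst]], dim=-1)))
        # Sum incoming messages per destination node, then gate the update.
        agg = torch.zeros_like(h).index_add_(0, dst, messages)
        return self.update(agg, h)

# Toy usage with random tensors standing in for real PLM/structure features.
N, E = 50, 200
layer = FusionGNNLayer(plm_dim=1280, geo_dim=16, hidden_dim=128)
plm_emb = torch.randn(N, 1280)
geo_feat = torch.randn(N, 16)
edge_index = torch.randint(0, N, (2, E))
out = layer(plm_emb, geo_feat, edge_index)  # shape (50, 128)
```

Concatenating at the node-feature level lets the structure-aware layers operate on language-model features directly, which is the general idea behind fusing the two modalities.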

Cited by 2 publications (4 citation statements: 0 supporting, 4 mentioning, 0 contrasting; all published in 2023) · References 25 publications

Citation statements, ordered by relevance:
“…Table 1: Comparison between the predictive performance of different methods in predicting pairwise PPIS of the test complexes of DBD5 and Dockground. Scores for the baseline methods on DBD5 are reported from [10,25,26]…”
Section: Pairwise PPIS Prediction Results (citation type: mentioning)
confidence: 99%
“…Some recent deep learning-based methods [8,10,25,26] have demonstrated that incorporating information from both the primary amino-acid sequence and the 3D structure of a protein leads to more accurate identification of PPIS. For extracting features from the primary sequence, the recently developed protein language models [27,28], trained on large datasets of 1D amino acid sequences, have been proved to be effective [8,26,29].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…Examples of the types of data used as input include the wild-type amino acid sequence ( Lin et al, 2022; Brandes et al, 2022 ), a multiple sequence alignment (MSA) ( Ng and Henikoff, 2001; Balakrishnan et al, 2011; Lui and Tiana, 2013; Nielsen et al, 2017; Hopf et al, 2017; Riesselman et al, 2018; Laine et al, 2019 ) or the protein structure ( Boomsma and Frellsen, 2017; Jing et al, 2021a; Hsu et al, 2022 ). Some methods have combined predictions from multiple protein data types at an aggregate level ( Strokach et al, 2021; Høie et al, 2022; Cagiada et al, 2023; Nguyen and Hy, 2023 ), although some results suggest that a richer representation might be learned by combining multiple data types at the input level ( Mansoor et al, 2021; Wu et al, 2023; Wang et al, 2022; Yang et al, 2022; Chen et al, 2023; Cheng et al, 2023; Zhang et al, 2023 ).…”
Section: Introduction (citation type: mentioning)
confidence: 99%
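The last statement distinguishes combining protein data types at an aggregate level (separate per-modality predictors whose outputs are merged) from combining them at the input level (a single predictor over joined features). A minimal sketch contrasting the two, with purely illustrative dimensions and head names:

```python
import torch
import torch.nn as nn

plm_dim, struct_dim, hidden = 1280, 64, 128

# Aggregate-level ("late") fusion: each modality gets its own head,
# and the per-modality predictions are merged afterwards.
seq_head = nn.Sequential(nn.Linear(plm_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))
struct_head = nn.Sequential(nn.Linear(struct_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

# Input-level ("early") fusion: modalities are concatenated before any
# learned interaction, so the network can learn cross-modal features.
early_head = nn.Sequential(nn.Linear(plm_dim + struct_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

seq_x = torch.randn(32, plm_dim)        # per-residue PLM embeddings (illustrative)
struct_x = torch.randn(32, struct_dim)  # structure-derived features (illustrative)

late_pred = 0.5 * (seq_head(seq_x) + struct_head(struct_x))
early_pred = early_head(torch.cat([seq_x, struct_x], dim=-1))
```

The cited suggestion that input-level combination may learn a richer representation corresponds to `early_head` here: the first linear layer already mixes sequence and structure features, whereas the late-fusion heads never see the other modality.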