2023
DOI: 10.1093/nar/gkad288

GeoBind: segmentation of nucleic acid binding interface on protein surface with geometric deep learning

Abstract: Unveiling the nucleic acid binding sites of a protein helps reveal its regulatory functions in vivo. Current methods encode protein sites from the handcrafted features of their local neighbors and recognize them via a classification, which are limited in expressive ability. Here, we present GeoBind, a geometric deep learning method for predicting nucleic binding sites on protein surface in a segmentation manner. GeoBind takes the whole point clouds of protein surface as input and learns the high-level represen…

Cited by 16 publications (16 citation statements)
References: 47 publications
“…Second, unlike methods that only explore the Cα models of proteins [25,40], GPSite exploits a comprehensive geometric featurizer to fully refine knowledge in the backbone and sidechain atoms. Third, the employed message propagation on residue graphs is global structure-aware and time-efficient compared to the methods based on surface point clouds [21,22], and memory-efficient unlike methods based on full atom graphs [23,24]. Last but not least, instead of predicting binding sites for a single molecule type or learning binding patterns separately for different molecules, GPSite applies multi-task learning to better model the latent relationships among different binding partners.…”
Section: Results (mentioning)
confidence: 99%
“…To demonstrate the effectiveness of our method, we compared GPSite with 8 sequence-based predictors including DRNApred [12], NCBRPred [43], SVMnuc [44], PepBind [14], PepNN-Seq [45], PepBCL [17], TargetS [15], and LMetalSite [18], as well as 15 structure-based predictors including NucBind [44], COACH-D [46], GraphBind [25], GeoBind [22], aaRNA [47], PepNN-Struct [45], DeepPPISP [48], SPPIDER [49], MaSIF-site [21], GraphPPIS [26], ScanNet [23], PeSTo [24], DELIA [19], MIB [50], and IonCom [51] (see Appendix 1, 'Brief introductions to the competitive methods', for more details). For most competitive methods, we used their webservers or standalone programs for evaluation.…”
Section: Results (mentioning)
confidence: 99%
“…In fact, the ESM2-generated feature embedding information has been successfully employed to lots of bioinformatics research fields, such as B-cell epitope identification [31], antihypertensive peptide screening [32], and protein-nucleic binding site prediction [33]. The ESM2 model is trained by a deep transformer neural network module on millions of diverse amino acid sequences for simulating protein language patterns, allowing us to extract information-rich feature vectors, which are robust to sequence diversity while being trained on a relatively small data set [34]. There are several pretrained models of ESM2 (see https://github.com/facebookresearch/esm).…”
Section: Materials and Methods (mentioning)
confidence: 99%
“…In this study, we employ ESM2, a pretrained protein language model introduced previously by Facebook to help us to dig out fast and automatically extract the high-latent discriminative representations of protein sequences relevant for protein functions. In fact, the ESM2-generated feature embedding information has been successfully employed to lots of bioinformatics research fields, such as B-cell epitope identification, antihypertensive peptide screening, and protein-nucleic binding site prediction. The ESM2 model is trained by a deep transformer neural network module on millions of diverse amino acid sequences for simulating protein language patterns, allowing us to extract information-rich feature vectors, which are robust to sequence diversity while being trained on a relatively small data set.…”
Section: Methods (mentioning)
confidence: 99%
“…Their method has been applied to protein-ligand, protein-NA, protein-ion and protein-lipid binding. GeoBind [21] utilizes quasi-geodesic convolutions over point clouds for DNA and RNA binding site prediction with features closely based on dMaSIF [22]. Other recent methods which do not utilize GNNs but are closely related include MaSIF [23] which predicts protein-protein binding interfaces and small-ligand binding sites and PST-PRNA [24] which predicts RNA bind sites.…”
Section: Introduction (mentioning)
confidence: 99%