2021
DOI: 10.1021/acs.jcim.0c01285

MSA-Regularized Protein Sequence Transformer toward Predicting Genome-Wide Chemical-Protein Interactions: Application to GPCRome Deorphanization

Abstract: Small molecules play a critical role in modulating biological systems. Knowledge of chemical–protein interactions helps address fundamental and practical questions in biology and medicine. However, with the rapid emergence of newly sequenced genes, the endogenous or surrogate ligands of a vast number of proteins remain unknown. Homology modeling and machine learning are two major methods for assigning new ligands to a protein but mostly fail when sequence homology between an unannotated protein and those with …

Cited by 23 publications (60 citation statements)
References 46 publications
“…When compared with the state-of-the-art method DISAE [1], which already was shown to outperform other leading methods for predicting CPIs of orphan receptors, PortalCG demonstrates superior performance in terms of both Receiver Operating Characteristic (ROC) and Precision-Recall (PR) curves, as shown in Figure 4(a). Because the ratio of positive and negative cases is imbalanced, the PR curve is more informative than the ROC curve.…”
Section: Portal Learning Significantly Outperforms State-of-the-art Approaches To Predicting Dark CPIs (mentioning)
confidence: 98%
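The remark in the statement above, that a PR curve is more informative than a ROC curve under class imbalance, can be illustrated with a little arithmetic. The sketch below is not taken from PortalCG or DISAE; the operating point (TPR 0.80, FPR 0.05) and the prevalence values are hypothetical numbers chosen only for illustration. It holds one ROC operating point fixed and shows how the implied precision changes as positives become rarer.

```python
def precision_at(tpr: float, fpr: float, prevalence: float) -> float:
    """Precision implied by a ROC operating point at a given positive-class prevalence.

    precision = TP / (TP + FP), with TP = tpr * prevalence and
    FP = fpr * (1 - prevalence), both expressed as fractions of all samples.
    """
    true_pos = tpr * prevalence
    false_pos = fpr * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point (assumption, not a reported result).
# TPR and FPR are each computed within one class, so this ROC point
# does not move when the positive/negative ratio changes.
TPR, FPR = 0.80, 0.05

for prevalence in (0.5, 0.1, 0.01):
    precision = precision_at(TPR, FPR, prevalence)
    print(f"prevalence={prevalence:<4}  TPR={TPR}  FPR={FPR}  precision={precision:.3f}")
```

The ROC point is identical in all three rows, while precision falls sharply as positives become rare. That is why a PR curve exposes differences between models that a ROC curve can hide when negatives dominate.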
“…In PortalCG, protein structure information is used as a portal to connect a source protein sequence and a corresponding target protein function (Figure 1A). We begin by performing self-supervised training to map tens of millions of sequences into a universal embedding space, using our recent distilled sequence alignment embedding (DISAE) algorithm [1]. Then, 3D structural information about the ligand-binding site is used to fine-tune the sequence embedding.…”
Section: Overview Of PortalCG (mentioning)
confidence: 99%
“…To model interdependency, MolTrans [ 24 ] leverages transformers built on frequent consecutive protein subsequences and SMILES notations to construct interaction maps for DTI pairs. DISAE [ 25 ] utilizes evolutionarily distilled sequence representations as inputs to ALBERT [ 26 ]. Some attention-based models train physical interactions between the substructures of ligands and binding sites of proteins to give better performance and interpretability.…”
Section: Introduction (mentioning)
confidence: 99%