2020
DOI: 10.1021/acs.jmedchem.0c00445

Enhancing Chemogenomics with Predictive Pharmacology

Abstract: One of the grand challenges in contemporary chemical biology is the generation of a probe for every member of the human proteome. Probe selection and optimization strategies typically rely on experimental bioactivity data to determine the potency and selectivity of candidate molecules. However, this approach is profoundly limited by the sparsity of the known data, the annotation bias often found in the literature, and the cost of physical screening. Recent advancements in predictive pharmacology, such as the a…

Cited by 3 publications (3 citation statements)
References 74 publications
“…First, as with most bioactivity data, the set is extremely sparse as only 0.67% of the activity data matrix is available of the total set consisting of 1.25 million compounds and almost 7000 proteins. Data sparsity has been shown to be of importance in the context of selectivity prediction [ 49 ] and though several groups have attempted to optimize modelling on these sparse matrices, it remains a challenge. One possible alleviation is the use of active learning to identify information-rich data points that are missing and experimentally determine them [ 50 ].…”
Section: Results (mentioning)
confidence: 99%
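The sparsity figure quoted in this statement can be checked with simple arithmetic. The sketch below is illustrative only: it takes "almost 7000 proteins" as exactly 7000, and the function name `matrix_coverage` is hypothetical rather than taken from the citing paper.

```python
def matrix_coverage(n_compounds: int, n_proteins: int, fill_fraction: float):
    """Return (total cells, approximate measured cells) of an activity matrix.

    Assumption: every compound-protein pair is one cell, and fill_fraction
    is the share of cells that have an experimental measurement.
    """
    total_cells = n_compounds * n_proteins
    measured_cells = round(total_cells * fill_fraction)
    return total_cells, measured_cells


# Figures from the citation statement; 7000 is a rounded-up assumption.
total, measured = matrix_coverage(1_250_000, 7_000, 0.0067)
print(f"total cells:    {total:,}")
print(f"measured cells: {measured:,}")
print(f"missing cells:  {total - measured:,}")
```

Under these assumptions the matrix has about 8.75 billion cells, of which roughly 59 million are measured, which is why imputation and active learning are raised as possible remedies.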
“…Publicly available chemogenomic databases are very far from complete, and therefore, ML modelling approaches can be used to provide estimates for missing data. DL-based models have shown promise in this context ( Gaudelet et al, 2021 ; James et al, 2020 ). We used DeepPurpose, a DL library for DTI prediction ( Huang et al, 2021 ) that takes as an input SMILES of the small molecules of interest and the amino acid sequences of the protein-coding genome.…”
Section: Methods (mentioning)
confidence: 99%
“…A final score was obtained by an average ranking of each protein across 14 models, with the final top-ranking targets predicted to be the most likely protein targets of the input drug list. Comparable consensus-oriented strategies are often applied in virtual screening to exploit the strengths of multiple models ( Gaudelet et al, 2021 ; James et al, 2020 ) and achieve improved performance ( Palacio-Rodríguez et al, 2019 ; Perez-Castillo et al, 2017 ). DeepPurpose models showed promising performance in various testing scenarios, and we refer to the original publication for further details.…”
Section: Methods (mentioning)
confidence: 99%
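The average-ranking consensus described in this statement can be sketched as follows. This is a minimal illustration, not the citing authors' implementation: the function names, the toy scores, the use of three models instead of 14, and the alphabetical tie-breaking rule are all assumptions, and no DeepPurpose calls are involved.

```python
from statistics import mean


def ranks_from_scores(scores: dict) -> dict:
    """Rank proteins within one model: rank 1 = highest predicted score.

    Assumption: ties are broken alphabetically by protein name.
    """
    ordered = sorted(scores, key=lambda protein: (-scores[protein], protein))
    return {protein: position + 1 for position, protein in enumerate(ordered)}


def consensus_ranking(model_scores: list) -> list:
    """Average each protein's rank across models; lowest mean rank comes first.

    Assumption: every model scores the same set of proteins.
    """
    per_model_ranks = [ranks_from_scores(scores) for scores in model_scores]
    proteins = per_model_ranks[0].keys()
    mean_rank = {p: mean(ranks[p] for ranks in per_model_ranks) for p in proteins}
    return sorted(mean_rank.items(), key=lambda item: (item[1], item[0]))


# Hypothetical scores from three models (the cited work averages over 14).
toy_scores = [
    {"A": 0.90, "B": 0.50, "C": 0.10},
    {"A": 0.40, "B": 0.80, "C": 0.20},
    {"A": 0.70, "B": 0.60, "C": 0.65},
]
consensus = consensus_ranking(toy_scores)
for protein, rank in consensus:
    print(protein, round(rank, 2))
```

Averaging ranks rather than raw scores avoids having to calibrate scores across models, at the cost of discarding how far apart the predictions are within each model.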