2021
DOI: 10.1101/2021.07.29.454377
Preprint

Virtual screening for small molecule pathway regulators by image profile matching

Abstract: Identifying chemical regulators of biological pathways is currently a time-consuming bottleneck in developing therapeutics and small-molecule research tools. Typically, thousands to millions of candidate small molecules are tested in target-based biochemical screens or phenotypic cell-based screens, both expensive experiments customized to a disease of interest. Here, we instead use a broad, virtual screening approach that matches compounds to pathways based on phenotypic information in public data. Our comput…

Citations: cited by 5 publications (5 citation statements)
References: 73 publications
“…We recently became interested in creating a very large public Cell Painting data set due to the success of multiple approaches where image data can be used to predict not just particular phenotypes about the parts of the cell that were stained, but entirely orthogonal data such as gene expression data or the outcome of a biochemical assay [10,11]. Such a database could also allow queries for compounds that match genes [12] and vice versa. It stands to reason that a large, well constructed set of image-based phenotypic screen data could not only inspire new computational tools to mine such data, but also serve as a resource for researchers to compare their own data against, accelerating discovery for thousands of scientists globally.…”
Section: Introduction (mentioning)
confidence: 99%
“…One promising method to predict mechanisms of action is to collect a profile from cells and attempt to match it to a library of profiles gathered from other chemical perturbations: a match, or close similarity, can be helpful if the compound the query matches is already well-known. Likewise, a match to a genetic perturbation means that the gene, or another gene in the same pathway, is a possible target of the query compound [23].…”
Section: Results (mentioning)
confidence: 99%
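The profile matching described in the statement above can be illustrated with a minimal sketch. It assumes each profile is a plain list of numeric features and uses Pearson correlation as the similarity measure; the function names (`pearson`, `rank_matches`), the choice of metric, and the toy values are illustrative assumptions, not the method or data of the cited work.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length feature vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank_matches(query, library):
    """Return (name, correlation) pairs, most similar profile first."""
    scores = [(name, pearson(query, profile)) for name, profile in library.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)

# Made-up profiles for illustration only; real image-based profiles
# have hundreds to thousands of features.
library = {
    "compound_A": [1.0, 2.0, 3.0, 4.0],
    "compound_B": [4.0, 3.0, 2.0, 1.0],
    "gene_X": [1.0, -1.0, 1.0, -1.0],
}
query = [2.0, 4.0, 6.0, 8.5]

ranked = rank_matches(query, library)
for name, score in ranked:
    print(f"{name}: {score:+.3f}")
```

The top-ranked entry is the candidate match for the query. Whether a given correlation counts as a match would depend on a threshold or statistical test that this sketch does not attempt to model.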
“…Our hope is that novel ML methods developed using our dataset will be used to discover new gene-compound connections (Rohban et al, 2021). This can yield new therapeutics for particular diseases, or identify how a potential drug is working and thus add to ground truth for this problem in the future.…”
Section: Introduction (mentioning)
confidence: 99%
“…Most compounds are thought to inhibit the function of their target gene’s product (as opposed to making it overly active), so we expect image-based profiles from cells treated with CRISPR to generally correlate to (mimic) the corresponding compound’s profile, whereas ORF profiles are generally expected to anti-correlate (oppose) the corresponding small molecule’s profile, and ORFs and CRISPRs targeting the same gene should generally yield opposite (anti-correlated) effects on the cells’ profiles. However, we strongly note that there will be numerous exceptions given the non-linear behavior of many biological systems and a number of distinct mechanisms by which these general principles may not hold (Rohban et al, 2021). In fact, one aim of generating this dataset is to quantify how often the expected relationships and directionalities occur.…”
Section: Introduction (mentioning)
confidence: 99%
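The expected directionalities in the statement above (CRISPR profiles mimic the compound, ORF profiles oppose it) can be written as a simple sign check. This is a sketch under stated assumptions: profiles are plain numeric lists, Pearson correlation is the similarity measure, and `EXPECTED_SIGN`, `matches_expectation`, and the toy values are hypothetical names and data introduced here. As the statement itself stresses, real data will contain many exceptions to these rules.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length feature vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Assumed general rule from the quoted statement: a CRISPR knockout should
# correlate with an inhibiting compound, an ORF overexpression should
# anti-correlate with it.
EXPECTED_SIGN = {"CRISPR": 1, "ORF": -1}

def matches_expectation(kind, genetic_profile, compound_profile):
    """True if the correlation has the sign expected for this perturbation type."""
    r = pearson(genetic_profile, compound_profile)
    return r * EXPECTED_SIGN[kind] > 0

# Made-up profiles for illustration only.
compound = [0.5, -1.0, 2.0, 0.0]
crispr = [0.4, -0.8, 1.9, 0.1]
orf = [-0.5, 1.1, -2.2, 0.2]

print(matches_expectation("CRISPR", crispr, compound))
print(matches_expectation("ORF", orf, compound))
print(pearson(crispr, orf) < 0)  # ORF and CRISPR for the same gene should oppose
```

Counting how often `matches_expectation` holds across many gene-compound pairs is one way to quantify how frequently the expected relationships occur, which the quoted statement names as an aim of the dataset.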