or will not assign the molecule at all if reference compounds with similar phenomic profiles are not present in the experiment. For poly-pharmacological compounds, morphological and transcriptional readouts reflect actions on multiple, functionally unrelated targets that are convolved into a complex phenotype. Their activity on specific and relevant targets must therefore be deconvolved from these phenomic profiles, which is far from trivial. Deconvolution can be achieved using supervised machine learning models to predict target activity 7. This approach requires that many of the phenomically profiled compounds have also been tested in the target- or MoA-specific assays it aims to cover. These compounds are used as training examples to identify the phenomic features that are most informative for predicting activity in each assay of interest. Consequently, the data scale required for supervised deconvolution can be prohibitive.

A continuing challenge in drug discovery is the design of an affordable set of phenomic profiling assays with maximal coverage of therapeutic targets. Ideally, this assay set should be able to document a phenotypic response towards at least the vast majority of the 549 human proteins targeted by 999 FDA-approved small-molecule drugs for human disease 12. Until recently, most studies that use phenomic profiling have been restricted in scope, typically using fewer than 100 reference compounds annotated to a limited number of targets 1,2,4,13-15, and are therefore not sufficiently representative to inform design principles for generic sets of phenomic profiling assays. In the current study, we used a nearest-reference approach to explore how various screening parameters affect the ability to distinguish MoAs from each other.
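The nearest-reference idea can be illustrated with a minimal sketch: a query profile receives the MoA of its most similar reference profile, or no assignment when no reference is similar enough. The similarity metric (cosine), the cut-off `min_similarity`, the function names, and the toy profiles are assumptions made for illustration only; they are not taken from the study.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero profile carries no directional information
    return dot / (norm_a * norm_b)


def assign_moa(query, references, min_similarity=0.5):
    """Return (moa, similarity) for the nearest reference profile.

    `references` is a list of (profile, moa_label) pairs. The label is None
    when the best similarity falls below `min_similarity` (a hypothetical
    cut-off), mirroring the case in which no similar reference is present.
    """
    best_moa, best_sim = None, -1.0
    for profile, moa in references:
        sim = cosine_similarity(query, profile)
        if sim > best_sim:
            best_moa, best_sim = moa, sim
    if best_sim < min_similarity:
        return None, best_sim
    return best_moa, best_sim


# Toy reference set with placeholder MoA labels (illustrative only).
references = [
    ([1.0, 0.0, 0.0], "MoA_A"),
    ([0.0, 1.0, 0.0], "MoA_B"),
]
```

Under these assumptions, `assign_moa([0.9, 0.1, 0.0], references)` assigns `"MoA_A"`, whereas a profile orthogonal to every reference, such as `[0.0, 0.0, 1.0]`, is left unassigned.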
We used gene-editing to create a panel of 15 reporter cell lines by introducing different combinations of 12 blue fluorescent protein (BFP), green fluorescent protein (GFP), or red fluorescent protein (RFP)/FusionRed signalling pathway and organelle markers into the A549, HepG2, and WPMY1 cell backgrounds, and profiled these reporter cell lines in live-cell, high-content imaging screens against a library of 1,008 small molecules, manually annotated with 218 unique MoA descriptors, at four concentrations.

Results

Library of 1,008 reference compounds and 169 natural products. We assembled a set of 1,008 well-characterized reference compounds, composed of FDA-approved drugs and commercially available tool compounds, and manually annotated each compound with one or more MoA descriptors using publicly available information from vendor compound catalogues, chemical databases, and large-scale target annotation projects 12,16-18 (Supplementary Table S1). In total, 218 unique MoA descriptors were assigned to the reference compound set. Of the 1,008 reference compounds, 829 (82%) were labelled with only a single MoA descriptor. Of the 218 MoAs, 132 (61%) were assigned to ≥ 3 co-annotated compounds and 92 (42%) were assigned to ≥ 5 co-annotated compounds (Supplementary Fig. S1). To increase...