Array-based comparative genomic hybridization (aCGH) is a recently developed tool for genome-wide determination of DNA copy number alterations. This technology has tremendous potential for disease-gene discovery in cancer and developmental disorders, as well as numerous other applications. However, widespread utilization of aCGH has been limited by the lack of well-characterized, high-resolution clone sets optimized for consistent performance in aCGH assays and of analytic software designed specifically for the technique. We have assembled a set of approximately 4100 publicly available human bacterial artificial chromosome (BAC) clones evenly spaced at approximately 1-Mb resolution across the genome, which includes direct coverage of approximately 400 known cancer genes. This aCGH-optimized clone set was compiled from five existing sets, experimentally refined, and supplemented for higher resolution and enhanced mapping capabilities. This clone set is associated with a public online resource containing detailed clone mapping data, protocols for the construction and use of arrays, and a suite of analytical software tools designed specifically for aCGH analysis. These resources should greatly facilitate the use of aCGH in gene discovery.
Gene-expression data are often used to infer pathways regulating transcriptional responses. For example, differentially expressed genes (DEGs) induced by compound treatment can help characterize hits from phenotypic screens, either by correlation with known drug signatures or by pathway enrichment. Pathway enrichment is, however, typically computed with DEGs rather than "upstream" nodes that are potentially causal of "downstream" changes. Here, we present graph-based models to predict causal targets from compound-microarray data. We test several approaches to traversing network topology, and show that a consensus minimum-rank score (SigNet) outperformed individual methods and could rank compound targets highly among all network nodes. In addition, larger, less canonical networks outperformed linear canonical interactions. Importantly, pathway enrichment using causal nodes rather than DEGs recovers relevant pathways more often. To further validate our approach, we used integrated data sets from the Cancer Genome Atlas to identify driving pathways in triple-negative breast cancer. Critical pathways were uncovered, including the epidermal growth factor receptor 2-phosphatidylinositide 3-kinase-AKT-MAPK growth pathway and ATR-p53-BRCA DNA damage pathway, in addition to unexpected pathways, such as TGF-WNT cytoskeleton remodeling, IL12-induced interferon gamma production, and TNFR-IAP (inhibitor of apoptosis) apoptosis; the latter was validated by pooled small hairpin RNA profiling in cancer cells. Overall, our approach can bridge transcriptional profiles to compound targets and driving pathways in cancer.
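The consensus minimum-rank idea mentioned above can be illustrated with a minimal sketch. The abstract does not specify how SigNet computes or combines its ranks, so everything here is an assumption for illustration: the method names (`random_walk`, `shortest_path`), the node scores, and the tie-handling rule are hypothetical. Each method scores every network node, scores are converted to ranks (1 is best), and a node's consensus value is the best rank it achieves under any method.

```python
from typing import Dict, List, Tuple


def rank_scores(scores: Dict[str, float]) -> Dict[str, int]:
    """Convert scores to ranks (1 = highest score); tied scores share the same rank."""
    ordered = sorted(scores.items(), key=lambda kv: -kv[1])
    ranks: Dict[str, int] = {}
    prev_score, prev_rank = None, 0
    for position, (node, score) in enumerate(ordered, start=1):
        if score != prev_score:
            prev_rank, prev_score = position, score
        ranks[node] = prev_rank
    return ranks


def consensus_min_rank(
    method_scores: Dict[str, Dict[str, float]]
) -> List[Tuple[str, int]]:
    """For each node, take the best (minimum) rank across all methods that scored it."""
    per_method = {m: rank_scores(s) for m, s in method_scores.items()}
    nodes = set().union(*(r.keys() for r in per_method.values()))
    consensus = {
        node: min(r[node] for r in per_method.values() if node in r)
        for node in nodes
    }
    # Sort by consensus rank, breaking ties alphabetically for a stable order.
    return sorted(consensus.items(), key=lambda kv: (kv[1], kv[0]))


# Hypothetical scores from two illustrative traversal methods (not from the paper).
example_scores = {
    "random_walk": {"EGFR": 0.9, "TP53": 0.4, "AKT1": 0.7},
    "shortest_path": {"EGFR": 0.2, "TP53": 0.8, "AKT1": 0.5},
}
ranking = consensus_min_rank(example_scores)
print(ranking)  # [('EGFR', 1), ('TP53', 1), ('AKT1', 2)]
```

Taking the minimum rank rewards a node that any single method places near the top, which is one plausible reading of why a consensus could outperform each method on its own.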
The elucidation of drug targets is important both to optimize desired compound action and to understand drug side-effects. In this study, we created statistical models which link chemical substructures of ligands to protein domains in a probabilistic manner and employed the models to triage the results of affinity chromatography experiments. By annotating targets with their InterPro domains, general rules of ligand-protein domain associations were derived and successfully employed to predict protein targets outside the scope of the training set. This methodology was then tested on a proteomics affinity chromatography data set containing 699 compounds. The domain prediction model correctly detected 31.6% of the experimental targets at a specificity of 46.8%. This is striking since 86% of the predicted targets are not part of the training set (but share InterPro domains with training-set targets), and thus could not have been predicted by conventional target prediction approaches. Target predictions improve drastically when significance (FDR) scores for target pulldowns are employed, emphasizing their importance for eliminating artifacts. Filament proteins (such as actin and tubulin) are found to be 'frequent hitters' in proteomics experiments, and their presence in pulldowns is not supported by the target predictions. On the other hand, membrane-bound receptors such as serotonin and dopamine receptors are noticeably absent in the affinity chromatography sets, although their presence would be expected from the predicted targets of compounds. While this can partly be explained by the experimental setup, we suggest the computational methods employed here as a complementary step in identifying protein targets of small molecules.
Affinity chromatography results for gefitinib are discussed in detail: two of the three kinases with the highest affinity for gefitinib in biochemical assays are detected by affinity chromatography, and the method additionally suggests the possible involvement of NSF as a target for modulating cancer progression via beta-arrestin.
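The substructure-to-domain association described above can be sketched as a simple smoothed frequency model. This is not the authors' model, whose form the abstract does not give; the class name, the Laplace smoothing, the max-based scoring rule, and the substructure and domain labels below are all assumptions made for illustration. The point it illustrates is the one the abstract emphasizes: a protein absent from the training set can still be scored if it shares a domain with training-set targets.

```python
from collections import Counter
from typing import Iterable, Set, Tuple


class SubstructureDomainModel:
    """Hypothetical sketch: estimate P(target carries domain d | ligand has substructure s)."""

    def __init__(self, alpha: float = 1.0) -> None:
        self.alpha = alpha            # Laplace smoothing constant (assumed, must be > 0)
        self.pair_counts = Counter()  # (substructure, domain) co-occurrence counts
        self.sub_counts = Counter()   # number of training pairs containing each substructure

    def fit(self, pairs: Iterable[Tuple[Set[str], Set[str]]]) -> "SubstructureDomainModel":
        """Each training pair is (ligand substructures, domains of its known target)."""
        for substructures, domains in pairs:
            for s in substructures:
                self.sub_counts[s] += 1
                for d in domains:
                    self.pair_counts[(s, d)] += 1
        return self

    def prob(self, substructure: str, domain: str) -> float:
        """Smoothed estimate that a target of a ligand with this substructure has this domain."""
        numerator = self.pair_counts[(substructure, domain)] + self.alpha
        denominator = self.sub_counts[substructure] + 2 * self.alpha
        return numerator / denominator

    def score(self, substructures: Set[str], domains: Set[str]) -> float:
        """Score a compound-protein pair by its strongest substructure-domain association."""
        return max(
            (self.prob(s, d) for s in substructures for d in domains),
            default=0.0,
        )


# Illustrative labels only; these are not real InterPro accessions or training data.
training_pairs = [
    ({"quinazoline"}, {"protein_kinase_domain"}),
    ({"quinazoline"}, {"protein_kinase_domain"}),
    ({"indole"}, {"gpcr_domain"}),
]
model = SubstructureDomainModel().fit(training_pairs)

# A protein never seen in training still scores highly if it shares a domain.
unseen_kinase_score = model.score({"quinazoline"}, {"protein_kinase_domain", "sh2_domain"})
unseen_gpcr_score = model.score({"quinazoline"}, {"gpcr_domain"})
print(unseen_kinase_score, unseen_gpcr_score)
```

Scoring at the domain level rather than the protein level is what lets such a model generalize beyond its training targets, at the cost of treating all proteins that share a domain as equally plausible.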