The identification and discovery of phenotypes from high content screening (HCS) images is a challenging task. Earlier works rely on image analysis pipelines to extract biological features, on supervised training methods, or on features generated by neural networks pretrained on non-cellular images. We introduce a novel, fully unsupervised deep learning algorithm that clusters cellular images with similar Mode-of-Action (MOA) using only the images’ pixel intensity values as input. The method outperforms existing approaches on the labelled subset of the BBBC021 dataset and achieves an accuracy of 97.09% for correctly classifying the MOA by nearest-neighbor matching. A unique aspect of the approach is that it can be trained on the entire unannotated dataset, correctly clusters similar treatments beyond the annotated subset, and can be used for novel MOA discovery.
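The nearest-neighbor MOA evaluation mentioned in this abstract can be illustrated with a minimal sketch. It assumes each treatment is already summarized by one embedding vector and uses cosine similarity; the function names, the toy vectors, and the optional rule of skipping neighbors from the same compound are illustrative assumptions, not details taken from the abstract.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nn_moa_accuracy(profiles, moas, compounds=None):
    """Leave-one-out 1-nearest-neighbor MOA accuracy.

    profiles:  one embedding (list of floats) per treatment
    moas:      annotated MOA label per treatment
    compounds: optional compound name per treatment; when given, neighbors
               that share the query's compound are skipped (an assumption
               made for this sketch)
    """
    correct = 0
    for i, query in enumerate(profiles):
        best_j, best_sim = None, -math.inf
        for j, candidate in enumerate(profiles):
            if j == i:
                continue  # never match a treatment to itself
            if compounds is not None and compounds[j] == compounds[i]:
                continue  # skip replicates of the same compound
            sim = cosine(query, candidate)
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_j is not None and moas[best_j] == moas[i]:
            correct += 1
    return correct / len(profiles)

# Toy data, made up for illustration only.
profiles = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
moas = ["A", "A", "B", "B"]
compounds = ["c1", "c2", "c3", "c4"]
accuracy = nn_moa_accuracy(profiles, moas, compounds)
```

Because the labels are used only to score the neighbor found for each treatment, this kind of evaluation is compatible with embeddings learned without any annotation.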
Cell-autonomous cancer dependencies are now routinely identified using CRISPR loss-of-function screens. However, a bias exists that makes it difficult to assess the true essentiality of genes located in amplicons, since the entire amplified region can exhibit lethal scores. These false-positive hits can either be discarded from further analysis, which in cancer models can mean losing a significant number of hits, or methods can be developed to rescue the true positives within amplified regions. We propose two methods to rescue true-positive hits in amplified regions by correcting for this copy number artefact. The Local Drop Out (LDO) method uses the relative lethality scores within genomic regions to assess true essentiality and does not require additional orthogonal data (e.g. copy number values). LDO is meant to be used in screens covering a dense region of the genome (e.g. a whole chromosome or the whole genome). The General Additive Model (GAM) method models the screening data as a function of the known copy number values and removes the systematic effect from the measured lethality. GAM does not require the same density as LDO, but does require prior knowledge of the copy number values. Both methods have been developed with single-sample experiments in mind so that the correction can be applied even in smaller screens. Here we demonstrate the efficacy of both methods at removing the copy number effect and rescuing hits from some of the amplified regions. We estimate a 70-80% decrease in false-positive hits in regions of high copy number with either method.
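The two correction ideas described in this abstract can be sketched in a few lines. This is not the published LDO or GAM implementation: the first function subtracts the median score of neighboring genes in genomic order, and the second replaces the GAM's smooth term with a much simpler per-copy-number median. All function names, the window size, the neutral copy number of 2, and the toy scores are assumptions made for illustration.

```python
from statistics import median

def local_dropout_correct(scores, window=2):
    """LDO-style sketch: subtract the median score of genomic neighbors.

    scores must be ordered by genomic position. No copy number data is used.
    """
    corrected = []
    n = len(scores)
    for i, score in enumerate(scores):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        neighbors = scores[lo:i] + scores[i + 1:hi]  # exclude the gene itself
        corrected.append(score - median(neighbors) if neighbors else score)
    return corrected

def copy_number_correct(scores, copy_numbers, neutral=2):
    """GAM-style sketch: remove the systematic copy number trend.

    The trend is estimated as the median score at each copy number value
    (a step-function stand-in for a fitted smooth), expressed relative to
    the median score at the neutral copy number.
    """
    by_cn = {}
    for score, cn in zip(scores, copy_numbers):
        by_cn.setdefault(cn, []).append(score)
    trend = {cn: median(values) for cn, values in by_cn.items()}
    baseline = trend.get(neutral, 0.0)
    return [s - (trend[cn] - baseline) for s, cn in zip(scores, copy_numbers)]

# Toy screen in genomic order: genes 3-7 sit in an amplicon (copy number 8),
# and gene 5 is the only truly essential gene. Values are made up.
scores = [0.0, 0.1, -0.1, -1.0, -1.1, -3.0, -0.9, -1.0, 0.0, 0.1]
copy_numbers = [2, 2, 2, 8, 8, 8, 8, 8, 2, 2]

ldo = local_dropout_correct(scores)
gam = copy_number_correct(scores, copy_numbers)
```

In this toy example both corrections leave gene 5 as the strongest hit while pulling the other amplified genes back toward zero. The neighbor-based version shows edge effects at the amplicon boundary, which is consistent with the abstract's remark that LDO is meant for densely covered regions.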
Head and neck squamous cell carcinoma (HNSCC) is a widely prevalent […] of these cancers and their spread. We additionally report here, for the first time, alterations in the CSMD1 gene in early premalignant lesions; we further show that this is likely to result in an increased ability of the cells to spread and, possibly, to multiply faster as well.
Recent development of novel methods based on deep neural networks has transformed how high-content microscopy cellular images are analyzed. Nonetheless, it is still a challenge to identify cellular phenotypic changes caused by chemical or genetic treatments and to elucidate the relationships among treatments in an unsupervised manner, due to the large data volume, high phenotypic complexity and the presence of a priori unknown phenotypes. Here we benchmarked five deep neural network methods and two feature engineering methods on a well-characterized public data set. In contrast to previous benchmarking efforts, the manual annotations were not provided to the methods, but rather used as evaluation criteria afterwards. The seven methods individually performed feature extraction or representation learning from cellular images, and were consistently evaluated for downstream phenotype prediction and clustering tasks. We identified the strengths of individual methods across evaluation metrics, and further examined the biological concepts of features automatically learned by deep neural networks.
No abstract
Introduction: It is important to understand the molecular mechanisms acting in potentially malignant epithelial lesions of the oral mucosa and to determine whether the disorder will remain stable, regress or progress to oral squamous cell carcinoma (OSCC). Some may transform to invasive OSCC independent of developing dysplasia as an intermediate step. Understanding the mechanisms and identifying predictive biomarkers is critical for treatment decisions. DNA copy number alterations (CNAs) and loss of heterozygosity (LOH) in cancer cells can now be estimated accurately using single nucleotide polymorphism (SNP) genotyping microarrays, and these estimates can be combined with gene expression data. Discovering genomic structural aberrations and their relationship to the expression levels of the genes therein can provide a basis for a deeper understanding of the molecular mechanisms leading to dysplasia and cancer. The main goal of this work is to explore the influence of structural mutations on gene expression in cancer. To do so we use a dataset of 32 head and neck samples. Immortal carcinoma and dysplasia data were obtained from cell lines, while mortal carcinoma and dysplasia data were obtained from primary cultures. Methods: The genotypic data were obtained from Illumina 550K Bead arrays and the gene expression data from Affymetrix HGU-133A microarrays. The CNAs and LOH were determined using a newly developed statistical framework, OncoSNP, which can process complex data derived from heterogeneous samples. The software uses both the log R ratio and the B allele frequency in a Bayesian framework to estimate the genomic aberrations. Results: First, the genomic patterns of CNA and LOH were studied; they displayed strong structure and identified a subgroup of immortal samples with a high level of amplification across all chromosomes. Second, the matching gene expression data were analyzed.
Finally, both genomic and gene expression datasets were combined, and genes with a high or low expression response to underlying copy number changes were searched for. As a result, we produced three classes of genes that show different behaviour in their expression levels in response to copy number (CN) changes. The first group is formed by genes whose expression correlates significantly with the underlying CN changes, as measured by Spearman correlation. The second and third gene lists were chosen such that expression does not correlate with CN but shows a high or low coefficient of variation, respectively. We then performed pathway analysis to show that these gene sets belong to distinct pathways related to cancer. The non-responding group is enriched for genes related to metabolism, while the responding group is enriched for genes belonging to cell signalling pathways. These findings will help to understand how relatively large and variable genomic changes through LOH and CNA can lead to similar disease phenotypes. Citation Format: {Authors}. {Abstract title} [abstract]. In: Proceedings of the 102nd Annual Meeting of the American Association for Cancer Research; 2011 Apr 2-6; Orlando, FL. Philadelphia (PA): AACR; Cancer Res 2011;71(8 Suppl):Abstract nr 38. doi:10.1158/1538-7445.AM2011-38
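The three-way gene classification described in this abstract can be sketched as follows, under stated assumptions. The abstract does not say how significance was assessed or which thresholds were used, so the permutation test, the `alpha` level, the `cv_cut` threshold, the class names, and the toy values below are all hypothetical.

```python
import math
import random
from statistics import mean, pstdev

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx and syy else 0.0

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def spearman_pvalue(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value (a stand-in significance test)."""
    rng = random.Random(seed)
    observed = abs(spearman(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(spearman(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def classify_gene(copy_number, expression, alpha=0.05, cv_cut=0.2):
    """Assign a gene to one of three classes (thresholds are hypothetical)."""
    if spearman_pvalue(copy_number, expression) < alpha:
        return "CN-responsive"
    m = mean(expression)
    cv = pstdev(expression) / abs(m) if m else math.inf
    return "non-responsive, high CV" if cv >= cv_cut else "non-responsive, low CV"

# Toy values for one gene per class across 8 samples (made up).
cn = [1, 2, 2, 3, 4, 5, 6, 8]
responsive = classify_gene(cn, [3.1, 4.0, 4.2, 5.0, 6.1, 6.9, 8.2, 9.5])
stable = classify_gene(cn, [5.0, 5.1, 4.9, 5.0, 5.1, 4.9, 5.0, 5.0])
variable = classify_gene(cn, [5.0, 9.0, 1.0, 5.0, 9.0, 1.0, 5.0, 5.0])
```

Rank-based correlation is a natural fit here because copy number is measured in discrete steps and need not relate linearly to expression.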
Systematic perturbation screens have provided comprehensive resources for the elucidation of cancer driver genes. However, few algorithms have been developed to robustly interrogate such datasets, particularly with a limited number of samples. Here we developed a computational tool called APSiC (Analysis of Perturbation Screens for identifying novel Cancer genes) and applied it to the large-scale deep shRNA screen DRIVE [1] to unveil novel genetic and non-genetic driver genes. APSiC identified both well-known and novel drivers across all cancer types and within individual cancer types. The analysis of individual cancer types revealed that cancer drivers segregate by cell of origin and that genes involved in mRNA splicing may be oncogenic or tumor suppressive depending on the cancer type. We discovered and functionally demonstrated that LRRC4B is a novel putative tumor suppressor gene in breast cancer. The analysis of DRIVE using APSiC is provided as a web portal and represents a valuable resource for the discovery of novel cancer genes. Advances in large-scale functional screening technologies have enabled the discovery of gene requirements across diverse cancer entities [2,3]. Systematic perturbation screens assess how genetic alterations or expression modulation of individual genes lead to phenotypic changes, revealing novel factors in carcinogenesis. McDonald et al. carried out the project DRIVE (deep RNAi interrogation of viability effects in cancer), a large perturbation screen targeting 7,837 genes in 398 cancer cell lines across a variety of malignancies to generate a comprehensive atlas of cancer dependencies [1]. In DRIVE [1], gene dependencies were evaluated using the raw cell viability readout of knockdown/out experiments (Fig.
1a, left) and tested using normality likelihood and correlation tests in a pan-cancer setting. We introduce APSiC (Analysis of Perturbation Screens for identifying novel Cancer genes), a novel tool for the systematic and robust interrogation of large-scale perturbation screens to discover gene dependencies for individual cancers even with a limited number of samples. Instead of the raw cell viability readout, we compute a rank profile for each gene by first ranking all genes by their viabilities upon knockdown in a given sample, normalizing the ranks to the range [0, 1], and then aggregating the normalized ranks for a given gene across all samples (Fig. 1a, Online Methods). Thus ranks close to zero represent reduced viability, while ranks close to one indicate cell growth upon knockdown. Incorporating the mutation and copy number status of the samples, APSiC identifies potential genetic and non-genetic cancer genes by assessing the deviation of the distribution of normalized ranks from what is expected by chance using the Bates and Irwin-Hall tests. The use of rank-based statistics with the Bates and Irwin-Hall distributions provides enhanced statistical power when the number of cell lines is limited. We consider three classes of genetic drivers (mutation oncogenes, amplification oncogenes, and mutation tumor suppressor...
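The rank profile and the Irwin-Hall test described above can be sketched as follows. The Irwin-Hall distribution is that of a sum of n independent uniform variables on [0, 1], and the Bates distribution is that of their mean, so both give the same tail probability. The exact normalization, the one-sided lower-tail test, the function names, and the toy viabilities are assumptions for illustration, not the APSiC implementation, which also incorporates mutation and copy number status.

```python
import math

def normalized_ranks(viabilities):
    """Rank genes within one sample and scale to [0, 1] (0 = lowest viability).

    Assumes at least two genes and no tied viabilities.
    """
    order = sorted(range(len(viabilities)), key=lambda g: viabilities[g])
    result = [0.0] * len(viabilities)
    for position, gene in enumerate(order):
        result[gene] = position / (len(viabilities) - 1)
    return result

def irwin_hall_cdf(x, n):
    """P(U_1 + ... + U_n <= x) for independent U_i ~ Uniform(0, 1).

    Uses the closed-form alternating sum, which is adequate for small n.
    """
    if x <= 0:
        return 0.0
    if x >= n:
        return 1.0
    total = sum((-1) ** k * math.comb(n, k) * (x - k) ** n
                for k in range(math.floor(x) + 1))
    return total / math.factorial(n)

def lower_tail_pvalue(rank_profile):
    """Chance of a rank sum this low if the ranks were uniform at random."""
    return irwin_hall_cdf(sum(rank_profile), len(rank_profile))

# Toy viabilities upon knockdown: 3 samples x 5 genes (made up).
samples = [
    [-2.0, 0.1, 0.3, -0.2, 0.5],
    [-1.5, 0.2, -0.1, 0.4, 0.0],
    [-1.0, 0.3, 0.1, -1.2, 0.2],
]
per_sample_ranks = [normalized_ranks(s) for s in samples]
gene0_profile = [r[0] for r in per_sample_ranks]  # ranks of gene 0 across samples
p_gene0 = lower_tail_pvalue(gene0_profile)
```

Because the null distribution of the rank sum is known exactly for any n, a test of this kind remains usable with only a handful of cell lines, which matches the motivation given in the text.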