The lack of explainability is one of the most prominent disadvantages of deep learning applications in omics. This ‘black box’ problem can undermine the credibility and limit the practical implementation of biomedical deep learning models. Here we present XOmiVAE, a variational autoencoder (VAE)-based interpretable deep learning model for cancer classification using high-dimensional omics data. XOmiVAE can reveal the contribution of each gene and each latent dimension to each classification prediction, as well as the correlation between each gene and each latent dimension. We also demonstrate that XOmiVAE can explain not only the supervised classification but also the unsupervised clustering results from the deep learning network. To the best of our knowledge, XOmiVAE is one of the first activation level-based interpretable deep learning models explaining novel clusters generated by a VAE. The explanations generated by XOmiVAE were validated by both the performance of downstream tasks and biomedical knowledge. In our experiments, XOmiVAE explanations of deep learning-based cancer classification and clustering aligned with current domain knowledge, including biological annotation and academic literature, which shows potential for novel biomedical knowledge discovery from deep learning models.
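To make the idea of a per-gene contribution to a latent dimension concrete, the following is a minimal sketch and not the XOmiVAE implementation. The abstract does not spell out the attribution procedure, so the sketch substitutes a single linear encoder unit, for which the contribution of each input relative to a reference sample is exactly the weight times the difference from the reference. The gene weights, bias, reference profile and sample values are all hypothetical toy numbers.

```python
from typing import List


def latent_value(weights: List[float], bias: float, sample: List[float]) -> float:
    """Value of one latent dimension for a linear encoder unit (assumed toy model)."""
    return bias + sum(w * x for w, x in zip(weights, sample))


def gene_contributions(
    weights: List[float], sample: List[float], reference: List[float]
) -> List[float]:
    """Contribution of each gene relative to a reference expression profile.

    For a linear unit this attribution is exact: the contributions sum to the
    change in the latent value between the reference and the sample.
    """
    return [w * (x - r) for w, x, r in zip(weights, sample, reference)]


# Hypothetical toy data: three genes, one latent dimension.
weights = [0.5, -1.0, 2.0]
bias = 0.1
reference = [1.0, 1.0, 1.0]  # e.g. a mean expression profile (assumption)
sample = [2.0, 0.0, 2.0]

contribs = gene_contributions(weights, sample, reference)
delta = latent_value(weights, bias, sample) - latent_value(weights, bias, reference)

# Rank genes by absolute contribution, most influential first.
ranked = sorted(range(len(contribs)), key=lambda i: abs(contribs[i]), reverse=True)

print("contributions:", contribs)
print("ranking (gene indices):", ranked)
print("sum of contributions vs. latent change:", sum(contribs), delta)
```

In a deep, non-linear network the same bookkeeping no longer holds exactly, which is why dedicated attribution methods are needed there; the sketch only illustrates what "contribution of each gene" means in the simplest case.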
The epithelial to mesenchymal transition (EMT) is a key cellular process underlying cancer progression, with multiple intermediate states whose molecular hallmarks remain poorly characterised. To fill this gap, we present a method to robustly evaluate EMT transformation in individual tumours based on transcriptomic signals. We apply this approach to explore EMT trajectories in 7180 tumours of epithelial origin and identify three macro-states with prognostic and therapeutic value, attributable to epithelial, hybrid E/M and mesenchymal phenotypes. We show that the hybrid state is relatively stable and linked with increased aneuploidy. We further employ spatial transcriptomics and single cell datasets to explore the spatial heterogeneity of EMT transformation and distinct interaction patterns with cytotoxic, NK cells and fibroblasts in the tumour microenvironment. Additionally, we provide a catalogue of genomic events underlying distinct evolutionary constraints on EMT transformation. This study sheds light on the aetiology of distinct stages along the EMT trajectory, and highlights broader genomic and environmental hallmarks shaping the mesenchymal transformation of primary tumours.
Tumour immunity is key for the prognosis and treatment of colon adenocarcinoma, but its characterisation remains cumbersome and expensive, requiring sequencing or other complex assays. Detecting tumour-infiltrating lymphocytes in haematoxylin and eosin (H&E) slides of cancer tissue would provide a cost-effective alternative to support clinicians in treatment decisions, but inter- and intra-observer variability can arise even amongst experienced pathologists. Furthermore, the compounded effect of other cells in the tumour microenvironment is challenging to quantify but could yield useful additional biomarkers. We combined RNA sequencing, digital pathology and deep learning through the InceptionV3 architecture to develop a fully automated computer vision model that detects prognostic tumour immunity levels in H&E slides of colon adenocarcinoma with an area under the curve (AUC) of 82%. Amongst tumour infiltrating T cell subsets, we demonstrate that CD8+ effector memory T cell patterns are most recognisable algorithmically with an average AUC of 83%. We subsequently applied nuclear segmentation and classification via HoVer-Net to derive complex cell-cell interaction graphs, which we queried efficiently through a bespoke Neo4J graph database. This uncovered stromal barriers and lymphocyte triplets that could act as structural hallmarks of low immunity tumours with poor prognosis. Our integrated deep learning and graph-based workflow provides evidence for the feasibility of automated detection of complex immune cytotoxicity patterns within H&E-stained colon cancer slides, which could inform new cellular biomarkers and support treatment management of this disease in the future.
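The graph-based step can be illustrated with a small, self-contained sketch. It is not the authors' pipeline: it uses an in-memory Python dictionary rather than a Neo4J database, takes hand-written coordinates rather than HoVer-Net output, and assumes that two cells interact when their centroids lie within a hypothetical distance threshold. It also assumes that a "lymphocyte triplet" means three mutually adjacent lymphocytes, which the abstract does not define.

```python
from itertools import combinations
from math import dist
from typing import Dict, List, Set, Tuple

# (x, y, cell type label); the labels are illustrative assumptions.
Cell = Tuple[float, float, str]


def build_contact_graph(cells: List[Cell], max_distance: float) -> Dict[int, Set[int]]:
    """Connect every pair of cells whose centroids are within max_distance."""
    graph: Dict[int, Set[int]] = {i: set() for i in range(len(cells))}
    for i, j in combinations(range(len(cells)), 2):
        if dist(cells[i][:2], cells[j][:2]) <= max_distance:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def count_triplets(cells: List[Cell], graph: Dict[int, Set[int]], cell_type: str) -> int:
    """Count groups of three mutually connected cells that all share cell_type."""
    nodes = [i for i, cell in enumerate(cells) if cell[2] == cell_type]
    return sum(
        1
        for a, b, c in combinations(nodes, 3)
        if b in graph[a] and c in graph[a] and c in graph[b]
    )


# Hypothetical toy slide: three clustered lymphocytes, one distant lymphocyte,
# and one stromal cell sitting among the cluster.
cells: List[Cell] = [
    (0.0, 0.0, "lymphocyte"),
    (1.0, 0.0, "lymphocyte"),
    (0.0, 1.0, "lymphocyte"),
    (5.0, 5.0, "lymphocyte"),
    (0.5, 0.5, "stromal"),
]

graph = build_contact_graph(cells, max_distance=1.5)
lymphocyte_triplets = count_triplets(cells, graph, "lymphocyte")
print("lymphocyte triplets:", lymphocyte_triplets)
```

The pairwise loop is quadratic in the number of cells, which is acceptable for a toy example; at whole-slide scale a spatial index or a graph database, as used in the study, would be the more practical choice.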