Recent developments in spatial transcriptomics (ST) technologies have enabled the profiling of transcriptome-wide gene expression while retaining the location information of measured genes within tissues. Moreover, the corresponding high-resolution hematoxylin and eosin-stained histology images are readily available for the ST tissue sections. Since histology images are easy to obtain, it is desirable to leverage information learned from ST to predict gene expression for tissue sections where only histology images are available. Here we present HisToGene, a deep learning model for gene expression prediction from histology images. To account for the spatial dependency of measured spots, HisToGene adopts Vision Transformer, a state-of-the-art method for image recognition. The well-trained HisToGene model can also predict super-resolution gene expression. Through evaluations on 32 HER2+ breast cancer samples with 9,612 spots and 785 genes, we show that HisToGene accurately predicts gene expression and outperforms ST-Net both in gene expression prediction and clustering tissue regions using the predicted expression. We further show that the predicted super-resolution gene expression also leads to higher clustering accuracy than observed gene expression. Gene expression predicted from HisToGene enables researchers to generate virtual transcriptomics data at scale and can help elucidate the molecular signatures of tissues.
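As a rough illustration of how a Transformer can account for spatial dependency between spots, the following minimal sketch adds position embeddings to per-spot patch features and applies a single self-attention layer before a linear gene-expression head. Every name, dimension, and weight here is a hypothetical placeholder with random values, and the embedding scheme (separate x and y lookup tables) is an assumption; this is not the HisToGene implementation.

```python
import numpy as np

def spot_tokens(patch_feats, coords, x_emb, y_emb):
    """Add x/y position embeddings (assumed lookup tables) to patch features."""
    return patch_feats + x_emb[coords[:, 0]] + y_emb[coords[:, 1]]

def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention across all spots."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Hypothetical sizes: 6 spots on an 8x8 grid, 16-dim features, 5 genes.
rng = np.random.default_rng(0)
n_spots, dim, n_genes, grid = 6, 16, 5, 8
patch_feats = rng.normal(size=(n_spots, dim))       # stand-in for image patch features
coords = rng.integers(0, grid, size=(n_spots, 2))   # integer (x, y) spot positions
x_emb = rng.normal(size=(grid, dim)) * 0.1
y_emb = rng.normal(size=(grid, dim)) * 0.1
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(3))
w_out = rng.normal(size=(dim, n_genes)) * 0.1       # linear gene-expression head

tokens = spot_tokens(patch_feats, coords, x_emb, y_emb)
attended, weights = self_attention(tokens, w_q, w_k, w_v)
pred_expr = attended @ w_out                        # (n_spots, n_genes)
```

Because each spot attends to every other spot in the section, the prediction for one spot can draw on the features and positions of its neighbors, which is the property the abstract attributes to the Vision Transformer.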
Recent progress in machine learning has produced competitive methods for many traditional bioinformatics topics, such as transcriptome sequencing and single-cell analysis. However, discovering biomedical correlations among cells across large-scale data sets remains challenging. Our attention-based neural network module, with 300 million parameters, captures biological knowledge in a data-driven way. The module provides high-quality embedding, taxonomy analysis, and similarity measurement. We tested the model on the Mouse Brain Atlas, which consists of 160,000 cells and 25,000 genes. The module produced findings that have been verified by biologists, and it outperformed autoencoder and principal component analysis baselines in our benchmarks.
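For context on the principal component analysis baseline mentioned above, here is a minimal sketch of embedding cells with PCA and measuring cell-to-cell similarity with cosine similarity. The toy expression matrix, its size, the log transform, and the number of components are assumptions chosen for illustration; the abstract does not specify how the benchmark was configured.

```python
import numpy as np

def pca_embed(expr, n_components):
    """Project cells onto the top principal components via SVD."""
    centered = expr - expr.mean(axis=0, keepdims=True)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

def cosine_similarity(emb):
    """Pairwise cosine similarity between rows of an embedding matrix."""
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)
    return unit @ unit.T

# Toy data (assumption): 50 cells x 200 genes of Poisson counts, log-transformed.
rng = np.random.default_rng(0)
expr = np.log1p(rng.poisson(lam=2.0, size=(50, 200)).astype(float))

emb = pca_embed(expr, n_components=10)   # (cells, components)
sim = cosine_similarity(emb)             # (cells, cells)
```

A learned embedding could be compared against this baseline by substituting its output for `emb` and evaluating the same similarity matrix.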
Recent developments in spatially resolved transcriptomics (SRT) technologies have enabled the profiling of transcriptome-wide gene expression while retaining the spatial location of each measured spot within a tissue. Meanwhile, the corresponding histopathology images of tissue sections are readily available and can be aligned to the measured spots. Because histology images are more convenient and affordable to obtain, we designed HOPE2Net, a multi-layer perceptron architecture that leverages information provided by SRT data to predict gene expression and pathway activities from histology images. Following systematic evaluations of different approaches for extracting deep image features and cellular morphology features, HOPE2Net performs feature selection on the output of a pre-trained Vision Transformer, a state-of-the-art deep learning model for image recognition. After extracting histological image features, HOPE2Net integrates them with position embeddings to optimize the gene expression and pathway activity prediction tasks. By analyzing breast cancer and prostate cancer SRT datasets obtained from numerous tissue sections in multiple patients, we demonstrate that HOPE2Net can accurately predict the gene expression patterns of highly variable genes and the activities of significantly enriched domain-specific pathways. We further show that the predicted gene expression and pathway activities can help detect cancer subtypes and aid in treatment decision-making. Given the growing interest in applying SRT in cancer genomics, we believe HOPE2Net has the potential to identify biomarkers from direct screening of tissue histology images, which may be implemented in clinical studies for cancer diagnosis and decision-making.
Citation Format: Kenong Su, Minxing Pang, Mingyao Li. HOPE2Net: Integrating histological features and position embeddings in spatially resolved transcriptomics to predict gene expression and pathway activities from histology images in tumors [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2022; 2022 Apr 8-13. Philadelphia (PA): AACR; Cancer Res 2022;82(12_Suppl):Abstract nr 1218.
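The HOPE2Net abstract above describes a multi-layer perceptron that combines histological image features with position embeddings to predict both gene expression and pathway activities. The sketch below shows one way such a combination could look: image features are concatenated with a sinusoidal encoding of spot coordinates and passed through a shared hidden layer with two output heads. The sinusoidal encoding, the concatenation, the layer sizes, and all names are assumptions for illustration; the abstract does not specify these details.

```python
import numpy as np

def sinusoidal_position(coords, dim):
    """Encode (x, y) coordinates as fixed sinusoidal features; dim must be divisible by 4."""
    n_freq = dim // 4
    freqs = 1.0 / (10000 ** (np.arange(n_freq) / n_freq))
    angles = coords[:, :, None] * freqs                      # (n_spots, 2, n_freq)
    enc = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return enc.reshape(len(coords), -1)                      # (n_spots, dim)

def mlp_forward(x, w1, b1, w_gene, w_path):
    """Shared ReLU hidden layer with separate gene and pathway heads."""
    hidden = np.maximum(x @ w1 + b1, 0.0)
    return hidden @ w_gene, hidden @ w_path

# Hypothetical sizes: 4 spots, 32-dim image features, 8-dim position encoding.
rng = np.random.default_rng(0)
n_spots, feat_dim, pos_dim, hidden_dim, n_genes, n_pathways = 4, 32, 8, 16, 10, 3
image_feats = rng.normal(size=(n_spots, feat_dim))           # stand-in for ViT features
coords = rng.integers(0, 64, size=(n_spots, 2)).astype(float)

x = np.concatenate([image_feats, sinusoidal_position(coords, pos_dim)], axis=1)
w1 = rng.normal(size=(feat_dim + pos_dim, hidden_dim)) * 0.1
b1 = np.zeros(hidden_dim)
w_gene = rng.normal(size=(hidden_dim, n_genes)) * 0.1
w_path = rng.normal(size=(hidden_dim, n_pathways)) * 0.1

gene_pred, pathway_pred = mlp_forward(x, w1, b1, w_gene, w_path)
```

Sharing the hidden layer between the two heads is one simple way to train both prediction tasks jointly; separate networks per task would be an equally plausible reading of the abstract.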