MOT: a Multi-Omics Transformer for multiclass classification tumour types predictions

Osseni, Mazid Abiodoun; Tossou, Prudencio; Laviolette, François; Corbeil, Jacques

doi:10.1101/2022.11.14.516459

Cited by 3 publications

(3 citation statements)

References 42 publications

“…However, incorporating multi-omics data and their crosstalk information in a Transformer is very challenging: when processing multi-omics data, the multi-modal features are usually multiplied by tens of thousands of genes, producing an extremely long input that is not acceptable by a common Transformer model (usually <512 words). Meanwhile, certain embedding methods for biological data, such as discretization and linear transformation, were introduced in the previous Transformer models (Osseni et al 2022, Cui et al 2023, Theodoris et al 2023), while biological information was largely lost during these kinds of embedding.…”

Section: Introduction (mentioning)

confidence: 99%

Pathformer: a biological pathway informed transformer for disease diagnosis and prognosis using multi-omics data

Liu, Tao, Cai et al. 2024, Bioinformatics

Motivation: Multi-omics data provide a comprehensive view of gene regulation at multiple levels, which is helpful in achieving accurate diagnosis of complex diseases like cancer. However, conventional integration methods rarely utilize prior biological knowledge and lack interpretability.

Results: To integrate various multi-omics data of tissue and liquid biopsies for disease diagnosis and prognosis, we developed a biological pathway informed Transformer, Pathformer. It embeds multi-omics input with a compacted multi-modal vector and a pathway-based sparse neural network. Pathformer also leverages criss-cross attention mechanism to capture the crosstalk between different pathways and modalities. We first benchmarked Pathformer with 18 comparable methods on multiple cancer datasets, where Pathformer outperformed all the other methods, with an average improvement of 6.3%-14.7% in F1 score for cancer survival prediction, 5.1%-12% for cancer stage prediction, and 8.1%-13.6% for cancer drug response prediction. Subsequently, for cancer prognosis prediction based on tissue multi-omics data, we used a case study to demonstrate the biological interpretability of Pathformer by identifying key pathways and their biological crosstalk. Then, for cancer early diagnosis based on liquid biopsy data, we used plasma and platelet datasets to demonstrate Pathformer’s potential of clinical applications in cancer screening. Moreover, we revealed deregulation of interesting pathways (e.g., scavenger receptor pathway) and their crosstalk in cancer patients’ blood, providing potential candidate targets for cancer microenvironment study.

Availability: Pathformer is implemented and freely available at https://github.com/lulab/Pathformer.

Supplementary information: Supplementary data are available at Bioinformatics online.

“…However, incorporating multi-omics data and their crosstalk information in a Transformer is very challenging: when processing multi-omics data, the multi-modal features are usually multiplied by the number of genes (tens of thousands), producing an extremely long input that is not acceptable by a common Transformer model (usually less than 512). Meanwhile, certain embedding methods for biological data, such as discretization and linear transformation, were introduced in the previous Transformer models [17][18][19], while biological information was largely lost during these kinds of embedding.…”

Section: Introduction (mentioning)

confidence: 99%

“…The crisscross attention mechanism of Transformer would be very useful to capture the crosstalk information [15]. However, the current embedding methods for biological data in Transformer models [16][17][18], such as discretization and linear transformation, usually missed the biological information. Meanwhile, multi-modal features were typically multiplied by the number of genes, resulting in long input sequences, which could strain available memory resources or necessitate a feature selection step [18,19].…”

Section: Introduction (mentioning)

confidence: 99%

Pathformer: a biological pathway informed Transformer integrating multi-omics data for disease diagnosis and prognosis

Liu, Tao, Cai et al. 2023, Preprint

Multi-modal biological data integration can provide comprehensive views of gene regulation and cell development. However, conventional integration methods rarely utilize prior biological knowledge and lack interpretability. To address these challenges, we developed Pathformer, a biological pathway informed deep learning model based on Transformer with bias to integrate multi-modal data. Pathformer leverages criss-cross attention mechanism to capture crosstalk between different biological pathways and between different modalities (i.e., multi-omics). It also utilizes SHapley Additive Explanation method to reveal key pathways, genes, and regulatory mechanisms. Through benchmark studies on 28 TCGA datasets, we demonstrated the superior performance and interpretability of Pathformer on various cancer classification tasks, compared to other integration models. Furthermore, we applied Pathformer to liquid biopsy multi-modal data integration with high accuracy in cancer diagnosis. Meanwhile, Pathformer revealed interesting molecularly altered pathways in cancer patients’ body fluid, such as ligand binding of scavenger receptors, iron transport, and DAP12 signaling transmission, which are related to extracellular vesicle transport, platelet, and immune response.

Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review

Alharbi, Vakanski 2023, Bioengineering

Cancer is a term that denotes a group of diseases caused by the abnormal growth of cells that can spread in different parts of the body. According to the World Health Organization (WHO), cancer is the second major cause of death after cardiovascular diseases. Gene expression can play a fundamental role in the early detection of cancer, as it is indicative of the biochemical processes in tissue and cells, as well as the genetic characteristics of an organism. Deoxyribonucleic acid (DNA) microarrays and ribonucleic acid (RNA)-sequencing methods for gene expression data allow quantifying the expression levels of genes and produce valuable data for computational analysis. This study reviews recent progress in gene expression analysis for cancer classification using machine learning methods. Both conventional and deep learning-based approaches are reviewed, with an emphasis on the application of deep learning models due to their comparative advantages for identifying gene patterns that are distinctive for various types of cancers. Relevant works that employ the most commonly used deep neural network architectures are covered, including multi-layer perceptrons, as well as convolutional, recurrent, graph, and transformer networks. This survey also presents an overview of the data collection methods for gene expression analysis and lists important datasets that are commonly used for supervised machine learning for this task. Furthermore, we review pertinent techniques for feature engineering and data preprocessing that are typically used to handle the high dimensionality of gene expression data, caused by a large number of genes present in data samples. The paper concludes with a discussion of future research directions for machine learning-based gene expression analysis for cancer classification.
