Flexible Data Trimming Improves Performance of Global Machine Learning Methods in Omics-Based Personalized Oncology

Tkachev, Victor; Sorokin, Maxim; Borisov, Constantin; Garazha, Andrew; Buzdin, Anton; Borisov, Nicolas

doi:10.3390/ijms21030713

Cited by 19 publications

(19 citation statements)

References 69 publications

“…Several methods were proposed for the assessment of drug efficiency based on gene/protein expression [16, 17, 18, 19] or mutation patterns [20, 21, 22]. Unfortunately, most such methods are either proprietary or employ machine learning on preceding cases [23, 24, 25, 26]. So, for evaluating a cannabis drug’s individual action, we have suggested a novel approach, the cannabis drug efficiency index (CDEI).…”

Section: Methods (mentioning)

confidence: 99%

System, Method and Software for Calculation of a Cannabis Drug Efficiency Index for the Reduction of Inflammation

Borisov, Ilnytskyy, Byeon, et al. (2020). IJMS

Self Cite

There are many varieties of Cannabis sativa that differ from each other in their composition of cannabinoids, terpenes and other molecules. The medicinal properties of these cultivars are often very different, with some being more efficient than others. This report describes the development of a method and software for analyzing the efficiency of various cannabis extracts and detecting their anti-inflammatory properties. The method uses high-throughput gene expression profiling data but can potentially use other omics data as well. According to the signaling pathway topology, the gene expression profiles are convoluted into signaling pathway activities using the signaling pathway impact analysis (SPIA) method. The method was tested by inducing inflammation in human 3D epithelial tissues (intestine, oral and skin), exposing these tissues to various extracts, and then performing transcriptome analysis. The analysis showed that the extracts differed in how efficiently they restored the transcriptome changes to the pre-inflammation state, which allows a cannabis drug efficiency index (CDEI) to be calculated for each of them.

“…Among these three categories, reinforcement learning is relatively less used for multi-omics data analysis. Developing the methodologies is an active area of research (21–25). Pan-cancer analysis is also being done.…”

Section: Introduction (mentioning)

confidence: 99%

Artificial Intelligence (AI)-Based Systems Biology Approaches in Multi-Omics Data Analysis of Cancer

Biswas, Chakrabarti (2020). Front. Oncol.

Cancer is the manifestation of abnormalities of different physiological processes involving genes, DNAs, RNAs, proteins, and other biomolecules whose profiles are reflected in different omics data types. As these bio-entities are strongly correlated, integrative analysis of different types of omics data (multi-omics data) is required to understand the disease from tumorigenesis to disease progression. Artificial intelligence (AI), specifically machine learning algorithms, can make decisive interpretations of “big”-sized complex data and, hence, appears to be the most effective tool for the analysis and understanding of multi-omics data for patient-specific observations. In this review, we have discussed the recent outcomes of employing AI in multi-omics data analysis of different types of cancer. Based on the research trends and significance in patient treatment, we have primarily focused on AI-based analysis for determining cancer subtypes, disease prognosis, and therapeutic targets. We have also discussed AI analysis of some non-canonical types of omics data, as they can play a determining role in cancer patient care. Additionally, we have briefly discussed the data repositories because of their pivotal role in multi-omics data storage, processing, and analysis.

“…Many ML methods may be used for such applications, e.g. decision trees [12,13], random forests, RF [14,15], linear [16], logistic [17], lasso [18,19], ridge [15,20] regressions, multi-layer perceptron, MLP [12,15,21,22], support vectors machines [12,13,15,23–25], adaptive boosting [26–28], as well as binomial naïve Bayesian [15] method.…”

Section: Introduction (mentioning)

confidence: 99%

“…Intelligent data filtering is, therefore, needed to reduce dimensionality of data [8]. However, a recent approach using dynamic feature extraction, or flexible data trimming, can significantly improve performances of ML-based methods for the real-world datasets [15,25].…”

Section: Introduction (mentioning)

confidence: 99%

Cancer gene expression profiles associated with clinical outcomes to chemotherapy treatments

Borisov, Sorokin, Tkachev, et al. (2020). BMC Med Genomics

Self Cite

Background: Machine learning (ML) methods still have limited applicability in personalized oncology due to the low number of available clinically annotated molecular profiles. This prevents sufficient training of ML classifiers that could be used for improving molecular diagnostics. Methods: We reviewed published datasets of high-throughput gene expression profiles corresponding to cancer patients with known responses to chemotherapy treatments. We browsed the Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA) and Tumor Alterations Relevant for GEnomics-driven Therapy (TARGET) repositories. Results: We identified data collections suitable for building ML models that predict responses to certain chemotherapeutic schemes. We identified 26 datasets, ranging from 41 to 508 cases per dataset. All the datasets identified were checked for ML applicability and robustness with leave-one-out cross-validation. Twenty-three datasets that had balanced numbers of treatment responder and non-responder cases were found suitable for ML. Conclusions: We collected a database of gene expression profiles associated with clinical responses to chemotherapy for 2786 individual cancer cases. Among them, seven datasets included RNA sequencing data (for 645 cases) and the others contained microarray expression profiles. The cases represented breast cancer, lung cancer, low-grade glioma, endothelial carcinoma, multiple myeloma, adult leukemia, pediatric leukemia and kidney tumors. Chemotherapeutics included taxanes, bortezomib, vincristine, trastuzumab, letrozole, tipifarnib, temozolomide, busulfan and cyclophosphamide.
