Abstract: Oral mucositis (OM) is a common, debilitating, dose-limiting toxicity of cancer treatment, including hematopoietic stem cell transplantation (HSCT). We hypothesized that the oral microbiome is disturbed during allogeneic HSCT, partially accounting for the variability in OM severity. Using 16S ribosomal RNA gene sequence analysis, metabolomic profiling, and computational methods, we characterized the behavior of the salivary microbiome and metabolome of 184…
“…Our results suggest that comparing the absolute feature frequencies (or relative frequencies) in different experiments leads to lower accuracy than measuring the fold change, as expressed by differences in the logged frequencies. Indeed, in most of our recent results [14, 16, 21, 32], we found that such a log normalization is essential.…”
Section: Discussion (mentioning)
confidence: 72%
“…4 Do not perform other dimension reductions. results [14,16,21,32], we found that such a log normalization is essential.…”
Section: Results (mentioning)
confidence: 80%
“…Let us follow the analysis for an Artificial Neural Network (ANN) based prediction of the emergence of Mucositis following bone marrow transplant in leukemic patients [21]. We present the differences between the configuration in every preprocessing steps through their influence on the prediction precision.…”
Section: Example On Mucositis Prediction Prognosis From Pre-transplan (mentioning)
Abstract: 16S sequencing results are often used for Machine Learning (ML) tasks. 16S gene sequences are represented as feature counts, which are associated with taxonomic representation. Raw feature counts may not be the optimal representation for ML. We checked multiple preprocessing steps and tested the optimal combination for 16S sequencing-based classification tasks. We computed the contribution of each step to the accuracy as measured by the Area Under Curve (AUC) of the classification. We show that the log of the feature counts is much more informative than the relative counts. We further show that merging features associated with the same taxonomy at a given level, through a dimension reduction step for each group of bacteria, improves the AUC. Finally, we show that z-scoring has a very limited effect on the results. These preprocessing steps are integrated into the MIPMLP - Microbiome Preprocessing Machine Learning Pipeline, which is available as a standalone version at https://github.com/louzounlab/microbiome/tree/master/Preprocess or as a service at http://mip-mlp.math.biu.ac.il/Home
Importance: Microbiome composition has been proposed as a biomarker (mic-marker) for multiple diseases. However, a clear analysis of the optimal way to represent the gene sequence counts is still lacking. We propose a simple and straightforward method that significantly improves the accuracy of mic-marker studies. This method can be of use to merge two of the most important advances in biology in the last decade: microbiome analysis, and the introduction of machine learning methods to biological studies.
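The three preprocessing steps named in the abstract above (merging features that share a taxonomy, taking the log of the counts, and optional z-scoring) can be sketched in a few lines. This is a minimal illustration, not the MIPMLP implementation: the function names, the use of the group mean as the merging step, and the pseudo-count `epsilon` added before the log are all assumptions made for the example. The abstract only states that a dimension reduction is applied within each taxonomic group.

```python
import math
from collections import defaultdict


def merge_by_taxon(sample, taxonomy):
    """Merge features sharing a taxon label (assumption: use the group mean)."""
    groups = defaultdict(list)
    for feature, count in sample.items():
        groups[taxonomy[feature]].append(count)
    return {taxon: sum(vals) / len(vals) for taxon, vals in groups.items()}


def log_transform(sample, epsilon=0.1):
    """Log10 of each value; epsilon is a hypothetical pseudo-count to avoid log(0)."""
    return {taxon: math.log10(value + epsilon) for taxon, value in sample.items()}


def zscore_columns(samples):
    """Z-score each taxon across samples (population std; constant columns map to 0)."""
    out = [dict() for _ in samples]
    for taxon in sorted(samples[0]):
        col = [s[taxon] for s in samples]
        mean = sum(col) / len(col)
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / len(col)) or 1.0
        for row, x in zip(out, col):
            row[taxon] = (x - mean) / std
    return out


def preprocess(samples, taxonomy, epsilon=0.1, zscore=True):
    """Apply merge -> log -> optional z-score to a list of {feature: count} samples."""
    logged = [log_transform(merge_by_taxon(s, taxonomy), epsilon) for s in samples]
    return zscore_columns(logged) if zscore else logged


# Toy data: three features, two of which share a genus label (all names hypothetical).
samples = [
    {"otu1": 0, "otu2": 90, "otu3": 5},
    {"otu1": 10, "otu2": 0, "otu3": 5},
]
taxonomy = {"otu1": "g__A", "otu2": "g__A", "otu3": "g__B"}
logged_only = preprocess(samples, taxonomy, zscore=False)
standardized = preprocess(samples, taxonomy)
```

Keeping z-scoring behind a flag mirrors the abstract's finding that it has a very limited effect, so the step can be switched off without touching the rest of the sketch.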
“…Many AML patients will go on to develop oral or dental complications from their cancer treatment, and the composition of the oral microbiome plays a role in determining this risk. In particular, the oral microbial composition has been shown to be associated with the development of oral mucositis, which is characterized by ulcerative lesions in the mouth, in hematopoietic stem cell transplantation patients with hematologic malignancies [37, 38]. Microbiome risk factors have also been associated with the development of oral candidiasis, which is an infection of the oral cavity, during cancer chemotherapy [39].…”
Background
The estimation of microbial networks can provide important insight into the ecological relationships among the organisms that comprise the microbiome. However, there are a number of critical statistical challenges in the inference of such networks from high-throughput data. Since the abundances in each sample are constrained to have a fixed sum and there is incomplete overlap in microbial populations across subjects, the data are both compositional and zero-inflated.
Results
We propose the COmpositional Zero-Inflated Network Estimation (COZINE) method for inference of microbial networks which addresses these critical aspects of the data while maintaining computational scalability. COZINE relies on the multivariate Hurdle model to infer a sparse set of conditional dependencies which reflect not only relationships among the continuous values, but also among binary indicators of presence or absence and between the binary and continuous representations of the data. Our simulation results show that the proposed method is better able to capture various types of microbial relationships than existing approaches. We demonstrate the utility of the method with an application to understanding the oral microbiome network in a cohort of leukemic patients.
Conclusions
Our proposed method addresses important challenges in microbiome network estimation, and can be effectively applied to discover various types of dependence relationships in microbial communities. The procedure we have developed, which we refer to as COZINE, is available online at https://github.com/MinJinHa/COZINE.
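To make the two data properties named in the Background concrete, the sketch below takes one sample of raw counts and builds (a) relative abundances, which sum to one and are therefore compositional, (b) binary presence/absence indicators, and (c) a continuous part defined only for the taxa that are present. It illustrates the kind of binary-plus-continuous representation a Hurdle-type model works with; it is not the COZINE code. The function name and the choice to center the log abundances over the nonzero entries are assumptions made for this example.

```python
import math


def hurdle_representation(counts):
    """Split one sample into relative abundances, presence flags, and centered logs."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    # Compositional: relative abundances are constrained to sum to 1.
    rel = [c / total for c in counts]
    # Binary part: 1 if the taxon was observed in this sample, else 0.
    present = [1 if c > 0 else 0 for c in counts]
    # Continuous part (assumption): log abundance centered on the mean log
    # of the observed taxa; absent taxa are left at 0.0.
    logs = [math.log(r) for r in rel if r > 0]
    center = sum(logs) / len(logs)
    continuous = [math.log(r) - center if r > 0 else 0.0 for r in rel]
    return rel, present, continuous


# Toy sample with two unobserved taxa (zero inflation).
rel, present, continuous = hurdle_representation([0, 10, 0, 30, 60])
```

Here `present` carries the zero pattern and `continuous` carries the relative magnitudes, so dependencies can in principle be sought within each part and between the two, as the Results paragraph describes.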
“…Shouval et al demonstrated that also oral microbiome-derived metabolites are altered in patients developing oral mucositis during HSCT. They analyzed the salivary metabolic profile of patients with and without severe oral mucositis, showing a reduction in N-acetylputrescine and agmatine, metabolites involved in the polyamine pathway [30]. Polyamines are small polycationic molecules produced by commensal bacteria with a wide array of biological functions including preservation of mucosal barrier integrity [77].…”
Section: Polyamines and Breath Metabolites (mentioning)
The gut microbiome has emerged as a major character in the context of hematopoietic stem cell transplantation. The biology underpinning this relationship is still to be defined. Recently, mounting evidence has suggested a role for microbiome-derived metabolites in mediating crosstalk between intestinal microbial communities and the host. Some of these metabolites, such as fiber-derived short-chain fatty acids or amino acid-derived compounds, were found to have a role also in the transplant setting. New interesting data have been published on this topic, posing a new intriguing perspective on comprehension and treatment. This review provides an updated comprehensive overview of the available evidence in the field of gut microbiome-derived metabolites and hematopoietic stem cell transplantation.