The growing number of microbiome-related studies has notably increased the availability of data on human microbiome composition and function. These studies provide the essential material to explore host-microbiome associations and their relation to the development and progression of various complex diseases. Improved data-analytical tools are needed to exploit all the information in these biological datasets, taking into account the peculiarities of microbiome data, i.e., their compositional, heterogeneous, and sparse nature. Predicting host phenotypes from taxonomy-informed feature selection, in order to establish associations between the microbiome and disease states, is beneficial for personalized medicine. In this regard, machine learning (ML) offers new ways to develop models that can be used for classification and prediction in microbiology, to infer host phenotypes and predict diseases, and to stratify patients by characterizing state-specific microbial signatures. Here we review the state-of-the-art ML methods and respective software applied in human microbiome studies, performed as part of the COST Action ML4Microbiome activities. This scoping review focuses on the application of ML in microbiome studies related to association and clinical use for diagnostics, prognostics, and therapeutics. Although the data presented here relate mostly to the bacterial community, many algorithms could be applied in general, regardless of the feature type. This literature and software review, covering this broad topic, follows the scoping review methodology. The manual identification of data sources was complemented with (1) an automated publication search through the digital libraries of the three major publishers using a natural language processing (NLP) toolkit, and (2) an automated identification of relevant software repositories on GitHub and a ranking of the related research papers relying on a learning-to-rank approach.
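The compositional and sparse nature of microbiome data mentioned above is often handled with a log-ratio transform before ML modeling. The following is a minimal sketch of the centered log-ratio (CLR) transform, a common choice that the abstract itself does not name; the function name `clr_transform` and the default pseudocount of 0.5 (used to avoid taking the log of zero counts) are assumptions for illustration only.

```python
import math


def clr_transform(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    Assumption: zeros are handled by adding a small pseudocount to every
    taxon; other zero-replacement strategies exist.
    """
    if not counts:
        raise ValueError("counts must be non-empty")
    shifted = [c + pseudocount for c in counts]
    if any(x <= 0 for x in shifted):
        raise ValueError("all shifted counts must be positive")
    logs = [math.log(x) for x in shifted]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]


# Hypothetical sample with a sparse (zero) taxon.
sample = [120, 30, 0, 5]
transformed = clr_transform(sample)
```

CLR values sum to zero within a sample and, without a pseudocount, are unchanged when all counts are multiplied by a constant, which is why the transform is insensitive to sequencing depth.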
Context: Although the gut microbiome has been widely studied in metabolic diseases, its role in polycystic ovary syndrome (PCOS) has been scarcely investigated. Objective: To compare the gut microbiome in late fertile age women with and without PCOS and to investigate whether changes in the gut microbiome correlate with PCOS-related metabolic parameters. Design: Prospective, case-control study using the Northern Finland Birth Cohort 1966. Setting: General community. Participants: 102 women with PCOS and 201 age- and body mass index (BMI)-matched non-PCOS control women. Clinical and biochemical characteristics of the participants were assessed at ages 31 and 46 and analyzed in the context of gut microbiome data at the age of 46. Intervention(s): None. Main outcome measure(s): Bacterial diversity, relative abundance, and correlations with PCOS-related metabolic measures. Results: Bacterial diversity indices did not differ significantly between PCOS and controls (Shannon diversity p = 0.979, unweighted UniFrac p = 0.175). Four genera whose balance helps to differentiate between PCOS and non-PCOS were identified. In the whole cohort, the abundances of two genera from Clostridiales, Ruminococcaceae UCG-002 and Clostridiales Family XIII AD3011 group, were correlated with several PCOS-related markers. Prediabetic PCOS women had significantly lower alpha diversity (Shannon diversity p = 0.018) and markedly increased abundance of genus Dorea (FDR = 0.03) compared to women with normal glucose tolerance. Conclusion: PCOS and non-PCOS women at late fertile age with similar BMI do not differ significantly in their gut microbial profiles. However, there are significant microbial changes in PCOS individuals depending on their metabolic health.
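The Shannon diversity index reported above is an alpha diversity measure computed per sample from taxon abundances. As a minimal sketch, assuming raw counts per taxon and the natural logarithm (the abstract does not state the log base or any preprocessing such as rarefaction), it can be computed as follows; the function name `shannon_diversity` and the example counts are hypothetical.

```python
import math


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample must contain at least one read")
    h = 0.0
    for c in counts:
        if c > 0:  # zero-count taxa contribute nothing to the sum
            p = c / total
            h -= p * math.log(p)
    return h


# Hypothetical samples: an even community and a single-taxon community.
even_sample = [10, 10, 10, 10]
dominated_sample = [40, 0, 0, 0]
h_even = shannon_diversity(even_sample)
h_dominated = shannon_diversity(dominated_sample)
```

An even community of four taxa reaches the maximum value ln(4), while a sample dominated by one taxon has an index of zero, so lower values indicate a less diverse community.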
Microbiome research is starting to move beyond the exploratory phase towards interventional trials and therefore well-characterized cohorts will be instrumental for generating hypotheses and providing new knowledge. As part of the Estonian Biobank, we established the Estonian Microbiome Cohort which includes stool, oral and plasma samples from 2509 participants and is supplemented with multi-omic measurements, questionnaires, and regular linkages to national electronic health records. Here we analyze stool data from deep metagenomic sequencing together with rich phenotyping, including 71 diseases, 136 medications, 21 dietary questions, 5 medical procedures, and 19 other factors. We identify numerous relationships (n = 3262) with different microbiome features. In this study, we extend the understanding of microbiome-host interactions using electronic health data and show that long-term antibiotic usage, independent from recent administration, has a significant impact on the microbiome composition, partly explaining the common associations between diseases.
The incidence of type 2 diabetes (T2D) has been increasing globally, and a growing body of evidence links T2D with altered microbiota composition. T2D is preceded by a long prediabetic state characterized by changes in various metabolic parameters. We tested whether the gut microbiome could have predictive potential for T2D development during the healthy and prediabetic disease stages. We used prospective data of 608 well-phenotyped Finnish men collected from the population-based Metabolic Syndrome in Men (METSIM) study to build machine learning models for predicting continuous glucose and insulin measures over a shorter (1.5 year) and a longer (4 year) period. Our results show that the inclusion of the gut microbiome improves prediction accuracy for modeling T2D-associated parameters such as glycosylated hemoglobin and insulin measures. We identified novel microbial biomarkers and described their effects on the predictions using interpretable machine learning techniques, which revealed complex linear and nonlinear associations. Additionally, our modeling strategy allowed us to compare the stability of model performance and biomarker selection, also revealing differences between short-term and long-term predictions. The identified microbiome biomarkers provide a predictive measure for various metabolic traits related to T2D, thus providing an additional parameter for personal risk assessment. Our work also highlights the need for robust modeling strategies and the value of interpretable machine learning.
IMPORTANCE: Recent studies have shown a clear link between gut microbiota and type 2 diabetes. However, current results are based on cross-sectional studies that aim to determine the microbial dysbiosis when the disease is already prevalent. In order to consider the microbiome as a factor in disease risk assessment, prospective studies are needed. Our study is the first to assess the gut microbiome as a predictive measure for several type 2 diabetes-associated parameters in a longitudinal study setting. Our results revealed a number of novel microbial biomarkers that can improve the prediction accuracy for continuous insulin measures and glycosylated hemoglobin levels. These results make the prospect of using the microbiome in personalized medicine promising.
Colorectal cancer (CRC) is a challenging public health problem whose successful treatment depends on the stage at diagnosis. Recently, CRC-specific microbiome signatures have been proposed as a marker for CRC detection. Since many countries have initiated CRC screening programs, it would be useful to analyze the microbiome in the samples collected in fecal immunochemical test (FIT) tubes for fecal occult blood testing. Therefore, we investigated the impact of FIT tubes and stabilization buffer on the microbial community structure evaluated in stool samples from 30 volunteers and compared the detected communities to those of fresh-frozen samples, highlighting previously published cancer-specific communities. Altogether, 214 samples were analyzed by 16S rRNA gene sequencing, including positive and negative controls. Our results indicated that the variation between individuals was greater than the differences introduced by the collection strategy. The vast majority of the genera were stable for up to 7 days. None of the changes observed between fresh-frozen samples and FIT tube specimens were related to previously identified CRC-specific bacteria. Overall, we show that FIT tubes can be used for profiling the microbiota in CRC screening programs. This circumvents the need to collect additional samples and can possibly improve the sensitivity of CRC detection.